PhD Defense by Seongmin Lee
Title: Visual and Algorithmic Explanations to Fortify AI Safety
Date: Monday, May 18, 2026
Time: 1 PM to 3 PM Eastern Time (US)
Location: Coda 114 (1st floor conference room; just walk in, no special access needed)
Virtual Meeting: https://gatech.zoom.us/j/91061621484
Seongmin Lee
CS Ph.D. Candidate
School of Computational Science and Engineering
College of Computing
Georgia Institute of Technology
Committee:
Dr. Duen Horng (Polo) Chau - Advisor, Georgia Tech, School of Computational Science & Engineering
Dr. Alex Endert - Georgia Tech, School of Interactive Computing
Dr. Chao Zhang - Georgia Tech, School of Computational Science & Engineering
Dr. Judy Hoffman - Georgia Tech, School of Interactive Computing
Dr. Oliver Brdiczka - Adobe, Adobe Firefly
Abstract:
As modern AI systems, such as diffusion-based generative models or large language models (LLMs), continue to grow in scale, complexity, and societal impact, understanding and mitigating their risks has become increasingly urgent yet challenging due to their black-box nature.
My thesis addresses this critical challenge by developing novel visualizations and algorithms that help people understand the reasons and mechanisms behind AI behaviors, and take actionable steps to mitigate risks. Our work is organized into three complementary thrusts:
(1) Attribute risks. We begin by investigating how to uncover the underlying causes of AI risks. We present the first survey bridging LLM interpretation and safety. Building on the insight from our survey that training data can offer intuitive explanations for LLM generations, we develop LLM Attributor, which visually reveals the training data sources behind LLM-generated text, offering a novel way to diagnose unsafe outputs.
(2) Explain failure. While interpretation algorithms reveal the causes of AI risks, their impact depends on how effectively they are communicated. To fill this gap, we introduce interactive visualizations that explain complex model mechanisms to broad audiences. Diffusion Explainer helps non-experts understand modern generative AI, outperforming traditional tools in a 56-participant user study. Extending visualization to non-generative models, VisCUIT empowers experts to explore the mechanisms behind classifier failures.
(3) Guide mitigation. To reduce risks, we introduce CRAYON, a set of simple yet powerful algorithms that help classifiers overcome reliance on irrelevant features using yes-no annotations; large-scale human evaluations with 5,893 participants show its superiority over 12 methods across three datasets, even over methods that require complex annotations. Extending to modern LLMs, we develop the SHINE algorithm to determine whether hallucinations stem from limited model knowledge or flawed generation strategies. SHINE effectively differentiates faithful text from two types of hallucinations across three LLMs, and outperforms seven hallucination detection methods across four datasets and four LLMs.
My PhD research develops practical, innovative, human-centered solutions to research problems grounded in real-world needs, from advancing AI education to improving LLM safety, leveraging close partnerships with leading companies such as Google, Adobe, Cisco, JPMorgan Chase, ADP, and Avast. My work has made significant impact across academia, industry, and society: Diffusion Explainer and its follow-up work Transformer Explainer have reached over 638k users in 210+ countries and territories and have been integrated into university AI courses (e.g., at MIT and Columbia). My research has been recognized with honors including the Korean Honor Scholarship, selection as an NCWIT AiC Collegiate Award Finalist, and an IEEE VIS Best Poster Award.