
PhD Defense by Seongmin Lee


Title: Visual and Algorithmic Explanations to Fortify AI Safety

 

Date: Monday, May 18, 2026

Time: 1PM to 3PM Eastern Time (US)

Location: Coda 114 (1st floor conference room; just walk in, no special access needed)

Virtual Meeting: https://gatech.zoom.us/j/91061621484

 

Seongmin Lee

CS Ph.D. Candidate

School of Computational Science and Engineering

College of Computing

Georgia Institute of Technology

https://seongmin.xyz/ 

 

Committee:

Dr. Duen Horng (Polo) Chau - Advisor, Georgia Tech, School of Computational Science & Engineering

Dr. Alex Endert - Georgia Tech, School of Interactive Computing

Dr. Chao Zhang - Georgia Tech, School of Computational Science & Engineering

Dr. Judy Hoffman - Georgia Tech, School of Interactive Computing

Dr. Oliver Brdiczka - Adobe, Adobe Firefly

 

Abstract:

As modern AI systems, such as diffusion-based generative models or large language models (LLMs), continue to grow in scale, complexity, and societal impact, understanding and mitigating their risks has become increasingly urgent yet challenging due to their black-box nature.

 

My thesis addresses this critical challenge by developing novel visualizations and algorithms that help people understand the reasons and mechanisms behind AI behaviors, and take actionable steps to mitigate risks. Our work is organized into three complementary thrusts:

 

(1) Attribute risks. We begin by investigating how to uncover the underlying causes of AI risks. We present the first survey bridging LLM interpretation and safety. Building on the insight from our survey that training data can offer intuitive explanations for LLM generations, we develop LLM Attributor, which visually reveals the training data sources behind LLM-generated text, offering a novel way to diagnose unsafe outputs.

 

(2) Explain failure. While interpretation algorithms reveal causes of AI risks, their impact depends on how effectively they are communicated. To fill this gap, we introduce interactive visualizations that explain complex model mechanisms to broad audiences. Diffusion Explainer helps non-experts understand modern generative AI, outperforming traditional tools in 56-participant user studies. Extending visualization to non-generative models, VisCUIT empowers experts to explore the mechanisms behind failures of classifiers.

 

(3) Guide mitigation. To reduce risks, we introduce CRAYON, a simple yet powerful algorithm that helps classifiers overcome reliance on irrelevant features using only yes-no annotations; large-scale human evaluations with 5,893 participants show its superiority over 12 methods across three datasets, including methods that require complex annotations. Extending to modern LLMs, we develop the SHINE algorithm to determine whether hallucinations stem from limited model knowledge or flawed generation strategies. SHINE effectively differentiates faithful text from two types of hallucinations across three LLMs, and outperforms seven hallucination detection methods across four datasets and four LLMs.

 

My PhD research develops practical, innovative, human-centered solutions for research problems grounded in real-world needs, from advancing AI education to improving LLM safety, through close partnerships with leading companies such as Google, Adobe, Cisco, JPMorgan Chase, ADP, and Avast. My work has made significant impact across academia, industry, and society: Diffusion Explainer and its follow-up work Transformer Explainer have reached over 638k users in 210+ countries and have been integrated into university AI courses (e.g., MIT, Columbia). My research has been recognized with honors including the Korean Honor Scholarship, being named an NCWIT AiC Collegiate Award Finalist, and the IEEE VIS Best Poster Award.

