PhD Defense by Adam Coscia

Title: Detecting and Mitigating Pedagogical Risks in Large Language Models With Visual Analytics

 

Adam Coscia

Ph.D. Candidate in Human-Centered Computing 

School of Interactive Computing, College of Computing

Georgia Institute of Technology 

https://adamcoscia.com/

 

Date: Monday, April 27, 2026

Time: 10:00 AM – 12:00 PM Eastern Time (U.S.)

Location: TSRB Room 334 (VIS Lab). Just walk in; show your BuzzCard to the concierge if asked.

Virtual Meeting (hybrid): https://gatech.zoom.us/j/3100254613?pwd=QWlKajNkOWlPbWkxR3N5MkZsTE9FZz09 

 

Committee

Dr. Alex Endert - School of Interactive Computing, Georgia Institute of Technology

Dr. Duen Horng (Polo) Chau - School of Computational Science & Engineering, Georgia Institute of Technology

Dr. Cindy Xiong Bearfield - School of Interactive Computing, Georgia Institute of Technology

Dr. Yalong Yang - School of Interactive Computing, Georgia Institute of Technology

Dr. Scott Crossley - Department of Special Education, Vanderbilt University

 

Abstract

The advent of powerful new large language models (LLMs) has catalyzed a surge in LLM-powered educational technologies, enabling transformational advances that can empower learner agency, deliver personalized study materials, and promote active learning. Yet persistent pedagogical risks, from bias and hallucinations to unfair grading and misalignment with instructional goals, highlight a critical technology gap. Existing tools for selecting, fine-tuning, and evaluating LLMs are not designed to address the unique challenges of educational contexts, making it difficult for data scientists to detect and mitigate the potential pedagogical risks of prematurely deploying LLM-powered educational technology.

 

This thesis addresses this gap by introducing human-in-the-loop visual analytics approaches that integrate automated analysis with interactive visualizations, enabling data scientists to more effectively discover, understand, and address pedagogical risks throughout the LLM development lifecycle. We organize these contributions under four main thrusts:

 

(1) Uncovering Harmful Biases and Stereotypes in LLM Selection: We introduce KnowledgeVIS, a visual analytics system that enables interactive exploration of fill-in-the-blank prompts to surface latent biases, stereotypes, and learned associations in foundation LLMs. By supporting comparative analysis across models, KnowledgeVIS empowers data scientists to make more informed model selection decisions prior to deployment, revealing risks that are often obscured by traditional benchmark-driven evaluation.

 

(2) Diagnosing LLM Decision-Making During Fine-Tuning: We present iScore, a human-in-the-loop visual analytics system for interpreting how LLMs make scoring decisions in educational tasks such as automatic writing assessment. By linking internal model representations with input perturbations and output variations, iScore enables data scientists to diagnose unintended decision-making criteria, uncover model sensitivities, and iteratively refine fine-tuning strategies to better align with pedagogical objectives.

 

(3) Measuring and Visualizing LLM Trustworthiness in Evaluation: We propose a novel framework for operationalizing LLM trustworthiness as a set of interpretable, pedagogically grounded metrics, coupled with visualizations that make these risks traceable within model outputs. Through a co-designed evaluation workflow, we demonstrate how these metrics improve the consistency, transparency, and defensibility of expert decision-making, while surfacing complex trade-offs that cannot be captured by traditional performance measures alone.

 

(4) Broadening Access to Visual Analytics in Education: We contribute TrustyVis, an open-source Python library that encapsulates the visual analytics techniques developed in this thesis into modular, reusable components. By lowering the barrier to integrating interactive visualizations into existing machine learning workflows, TrustyVis enables scalable and practical adoption of human-in-the-loop approaches for evaluating and improving LLM-powered educational systems.

 

Through a multi-year longitudinal co-design process with data scientists, as well as several deployments and integrations in real-world educational settings, this thesis demonstrates how human-in-the-loop visual analytics can transform opaque LLM pipelines into transparent, iterative, and trustworthy development processes, ultimately supporting the responsible integration of LLMs into high-stakes learning environments. The outcomes of this thesis have been disseminated through multiple publications in top journals and conferences. They advance the state of the art in visual analytics, human-computer interaction, artificial intelligence, and educational technology by establishing new methods, systems, and design principles for making LLM behavior interpretable in pedagogical contexts.

 

 
