PhD Defense by Bolin Lai
Title: Multimodal Human Behavior Modeling: From Understanding to Generation
Date: Tuesday, March 31st
Time: 3:00-5:00pm ET
Remote Link: https://gatech.zoom.us/j/96560653822?pwd=PKFEAdNbnxP79Qua7qddx0MZ6qeIxo.1&from=addon
Bolin Lai
Machine Learning PhD Student
School of Electrical and Computer Engineering
Georgia Institute of Technology
Committee
1 Dr. James Rehg (Advisor, CS, UIUC)
2 Dr. Zsolt Kira (Advisor, IC, Georgia Tech)
3 Dr. James Hays (IC, Georgia Tech)
4 Dr. Judy Hoffman (IC, Georgia Tech)
5 Dr. Humphrey Shi (IC, Georgia Tech)
Abstract
Human behavior modeling is a critical step in developing AI agents that can assist us with various tasks. In contrast to objects, scenes, and textures, human behaviors are inherently purposeful, guided by underlying intentions and goals. In addition, human behaviors involve precise, adaptive interactions with the environment, characterized by fine-grained and nuanced control. These two key differences demand innovative approaches that enable AI models to understand the intentions behind our behaviors and to capture the nuances of our actions across different tasks. In this thesis, I elaborate on my research on leveraging multimodal inputs to capture underlying intentions and to enable precise control over human actions in both understanding and generation problems. The thesis comprises four chapters: audio-visual gaze anticipation, multimodal social behavior understanding, text-guided egocentric action generation, and training-free text-and-image-conditioned action generation. The ultimate goal of my research is to enable AI models to better understand and interact with people, paving the way toward human-centric artificial intelligence.
Status
- Workflow status: Published
- Created by: Tatianna Richardson
- Created: 03/24/2026
- Modified By: Tatianna Richardson
- Modified: 03/24/2026