
PhD Defense by Bolin Lai


Title: Multimodal Human Behavior Modeling: From Understanding to Generation

 

Date: Tuesday, March 31st

Time: 3:00-5:00pm ET

Remote Link: https://gatech.zoom.us/j/96560653822?pwd=PKFEAdNbnxP79Qua7qddx0MZ6qeIxo.1&from=addon

 

Bolin Lai

Machine Learning PhD Student

School of Electrical and Computer Engineering
Georgia Institute of Technology

 

Committee

1. Dr. James Rehg (Advisor, CS, UIUC)

2. Dr. Zsolt Kira (Advisor, IC, Georgia Tech)

3. Dr. James Hays (IC, Georgia Tech)

4. Dr. Judy Hoffman (IC, Georgia Tech)

5. Dr. Humphrey Shi (IC, Georgia Tech)

 

Abstract

Human behavior modeling is a critical step in developing AI agents that can assist us in a variety of tasks. In contrast to objects, scenes, and textures, human behaviors are inherently purposeful, guided by underlying intentions and goals. Human behaviors also involve precise, adaptive interactions with the environment, characterized by fine-grained and nuanced control. These two key differences demand innovative approaches that allow AI models to infer the intentions behind our behaviors and to capture the nuances of our actions across tasks. In this defense, I will present my research on leveraging multimodal inputs to capture underlying intentions and enable precise control of human actions in both understanding and generation problems. My thesis comprises four chapters: audio-visual gaze anticipation, multimodal social behavior understanding, text-guided egocentric action generation, and training-free text-image conditioned action generation. The ultimate goal of my research is to enable AI models to better understand and interact with people, paving the way toward human-centric artificial intelligence.

 

 

 
