School of CSE Seminar Series: Manling Li

Speaker: Manling Li, assistant professor at Northwestern University
Date and Time: February 24, 11:00 a.m. - 12:00 p.m.
Location: Coda 114
Host: Bo Dai

Title: Toward Foundation Agents: How Multimodal Models Learn (and Fail to Learn) the Physical World

Abstract: Today’s multimodal models are often trained with a brute-force “align everything” recipe, yet it is still unclear how cross-modal intelligence can emerge. We argue the key question is mechanistic: how can models go beyond static alignment annotations to learn from physical-world interaction and support goal-directed decision making? We systematically study multimodal learning through the MDP agent loop: state estimation, world modeling for planning, and control for safety. First, we open up the black box and intervene inside embeddings to reveal how geometry is lost, then design ways to retain geometric structure. Second, we inject world-model priors to teach dynamics through RAGEN/VAGEN, enabling multi-step planning rather than token matching. Third, we introduce ODE-Steer for safe agents, which steers internal activations into “safe zones” where reasoning stays reliable and controllable. Lastly, we lay out a view of the future: true multimodal intelligence requires more than aligning tokens; it requires aligning the internal mechanisms of the model with the geometry of the world.

Bio: Manling Li is an Assistant Professor at Northwestern University and an Amazon Scholar. She was a postdoc at Stanford University and obtained her Ph.D. degree in Computer Science at the University of Illinois Urbana-Champaign in 2023. She works on Reasoning, Planning, and Compositionality at the intersection of Language, Vision, and Robotics. Her work has been recognized with the ACL 2025 Inaugural Dissertation Award Honorable Mention, MIT Tech Review Innovators Under 35, ACL'24 Outstanding Paper Award, NAACL'21 Best Demo Paper Award, ACL'20 Best Demo Paper Award, Microsoft Research PhD Fellowship, EECS Rising Star, etc. She served as virtual chair of ACL 25, publication chair at NAACL 25, demo chair at EMNLP 24, etc. Additional information is available at https://limanling.github.io/.

Status

  • Workflow status: Published
  • Created by: Bryant Wine
  • Created: 02/20/2026
  • Modified By: Bryant Wine
  • Modified: 02/20/2026
