PhD Proposal by Shuo Cheng
Title: Robot Skill Representations for Long-Horizon Task Learning and Deployment
Shuo Cheng
Ph.D. Student in Computer Science
School of Interactive Computing
Georgia Institute of Technology
https://sites.google.com/view/shuocheng
Date: Monday, Nov 17th, 2025
Time: 11:00 AM - 12:30 PM EST
Location: Klaus 1212, Zoom Link
Committee:
Dr. Danfei Xu (Advisor) – School of Interactive Computing, Georgia Institute of Technology
Dr. Sehoon Ha – School of Interactive Computing, Georgia Institute of Technology
Dr. Harish Ravichandar – School of Interactive Computing, Georgia Institute of Technology
Abstract
Robots must acquire versatile and generalizable skills to operate effectively in complex, unstructured environments. This thesis investigates how to enable such capabilities through two complementary directions: (1) structural skill representations for scalable and generalizable learning, and (2) data generation and co-training frameworks for learning reactive, deployable policies. The first part develops representations that embed structural inductive biases to support efficient skill acquisition and reuse. LEAGUE progressively learns neuro-symbolic reinforcement learning policies within a task and motion planning system, enabling compositional skill learning over long horizons; the work received a Best Paper Honorable Mention from IEEE RA-L. NOD-TAMP enables one-shot skill adaptation across novel geometries and configurations through learned neural object descriptors. The second part introduces robotic data generation and policy distillation frameworks for large-scale, cross-domain learning. OT-Sim2Real leverages the learned skill representations to synthesize diverse simulation demonstrations and employs a co-training strategy that improves policy generalization beyond real-world data coverage, bridging the gap between simulation and deployment.
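As a rough illustration of the co-training idea only, the sketch below shows one common way domain mixing can be set up as behavior cloning in PyTorch. It is an assumption-laden sketch, not the OT-Sim2Real implementation; the names policy, sim_demos, real_demos, and sim_ratio are hypothetical.

    # A minimal sketch, assuming (obs, action) demonstration datasets; not the authors' code.
    import torch
    from torch.utils.data import DataLoader

    def endless(loader):
        # Re-iterate the DataLoader forever so batches never run out mid-training.
        while True:
            for batch in loader:
                yield batch

    def cotrain(policy, sim_demos, real_demos, sim_ratio=0.8, steps=10_000, lr=1e-4):
        # Interleave batches from both domains: broad simulation coverage plus a
        # smaller stream of real data that keeps the policy anchored to deployment.
        sim_batches = endless(DataLoader(sim_demos, batch_size=64, shuffle=True))
        real_batches = endless(DataLoader(real_demos, batch_size=64, shuffle=True))
        opt = torch.optim.Adam(policy.parameters(), lr=lr)
        for _ in range(steps):
            obs, act = next(sim_batches) if torch.rand(()) < sim_ratio else next(real_batches)
            loss = torch.nn.functional.mse_loss(policy(obs), act)  # simple behavior-cloning loss
            opt.zero_grad()
            loss.backward()
            opt.step()
        return policy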
Proposed Work: Building on these foundations, future research will investigate how large-scale human videos can be leveraged to learn composable motion priors, enabling scalable learning of robot manipulation systems that generalize across diverse tasks and embodiments. A factor graph representation is proposed to decompose the manipulation process into structured subcomponents, facilitating efficient learning of motion samplers from human videos. In addition, a diffusion-based steering strategy will be explored to guide the sampling process toward trajectories that satisfy task-specific constraints effectively and efficiently.
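For the diffusion-based steering mentioned above, one plausible instantiation, offered here as an assumption rather than the proposed method, is classifier-guidance-style sampling: each denoising step is nudged by the gradient of a task-constraint cost. The names denoiser, constraint_cost, betas, and guidance_scale below are hypothetical.

    # A minimal sketch of constraint-guided reverse diffusion over a trajectory tensor.
    import torch

    def steered_sample(denoiser, constraint_cost, betas, traj_shape, guidance_scale=1.0):
        alphas = 1.0 - betas
        alpha_bars = torch.cumprod(alphas, dim=0)
        x = torch.randn(traj_shape)  # start from Gaussian noise over the whole trajectory
        for t in reversed(range(len(betas))):
            with torch.no_grad():
                eps = denoiser(x, t)  # predicted noise at step t
                mean = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
            # Steering: follow the gradient of the constraint cost (e.g., collision or
            # goal-reaching penalties) evaluated on the current sample.
            x_req = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(constraint_cost(x_req), x_req)[0]
            x = mean - guidance_scale * grad
            if t > 0:
                x = x + torch.sqrt(betas[t]) * torch.randn_like(x)  # re-inject noise except at the final step
        return x.detach()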