PhD Defense by Maks Sorokin

Title: "Levers of Robot Learning: From Privileged Training to Vision-Based Deployment"

 

Date: Monday, April 13, 2026

Time: 1:00 - 3:00 PM ET

Location: Remote (https://gatech.zoom.us/j/99520988331)

 

Maks Sorokin

Robotics Ph.D. Candidate

School of Interactive Computing

Georgia Institute of Technology

https://itsmaks.com/

 

Committee

Dr. Sehoon Ha (Advisor) - School of Interactive Computing, Georgia Institute of Technology

Dr. Danfei Xu - School of Interactive Computing, Georgia Institute of Technology

Dr. Sonia Chernova - School of Interactive Computing, Georgia Institute of Technology

Dr. C. Karen Liu - Department of Computer Science, Stanford University

Dr. Jie Tan - Director, Google DeepMind

Dr. Simon Le Cleac'h - Research Scientist, RAI Institute

 

Abstract

Robot learning systems typically train with privileged information (bird's-eye-view maps, ground-truth object positions, full environment state) that is unavailable once the robot is deployed with only onboard cameras. This thesis develops four systems spanning navigation, robot design, and whole-body manipulation, and identifies in each case the design choice in representation, evaluation, or distillation that enabled deployment with onboard vision.

A quadruped navigated 3.2 km of urban sidewalks using penultimate features from a pre-trained segmentation network as its visual input, achieving 83% real-world success where raw images achieved 25% and semantic labels 7%. A mobile manipulator's morphology was optimized by evaluating candidates with onboard cameras rather than privileged state, producing designs that achieved 80% success and required 25x less training data than a human-expert baseline. A navigation policy trained in abstract colored-tile worlds transferred zero-shot to photorealistic simulation and real hardware using sparse boundary points as its only perception input (87-100% vs. 0-48% for dense representations). A Spot quadruped robot with an arm learned to push, roll, and upright a 15 kg car tire using hierarchical RL, and cascaded distillation transferred the resulting policies from privileged state to onboard perception.

Status

  • Workflow status: Published
  • Created by: Tatianna Richardson
  • Created: 04/02/2026
  • Modified By: Tatianna Richardson
  • Modified: 04/02/2026
