{"680585":{"#nid":"680585","#data":{"type":"news","title":"New Algorithm Teaches Robots Through Human Perspective","body":[{"value":"\u003Cp\u003EA new data creation paradigm and algorithmic breakthrough from Georgia Tech has laid the groundwork for humanoid assistive robots to help with laundry, dishwashing, and other household chores. The framework enables these robots to learn new skills by mimicking actions from first-person videos of everyday activities.\u003C\/p\u003E\u003Cp\u003ECurrent training methods limit robots from being produced at the necessary scale to put a robot in every home, said \u003Cstrong\u003ESimar\u003C\/strong\u003E \u003Cstrong\u003EKareer\u003C\/strong\u003E, a Ph.D. student in the School of Interactive Computing.\u003C\/p\u003E\u003Cp\u003E\u201cTraditionally, collecting data for robotics means creating demonstration data,\u201d Kareer said. \u201cYou operate the robot\u2019s joints with a controller to move it and achieve the task you want, and you do this hundreds of times while recording sensor data, then train your models. This is slow and difficult. The only way to break that cycle is to detach the data collection from the robot itself.\u201d\u003C\/p\u003E\u003Cp\u003E\u003Ca href=\u0022https:\/\/youtu.be\/ckGUsdFX9pU?si=7qmGR1D5P_iPAVMt\u0022\u003E\u003Cstrong\u003E[VIDEO: Meta Shares EgoMimic Case Study Video]\u003C\/strong\u003E\u003C\/a\u003E\u003C\/p\u003E\u003Cp\u003EOther fields, such as computer vision and natural language processing (NLP), already leverage training data passively culled from the internet to create powerful generative AI and large-language models (LLMs).\u003C\/p\u003E\u003Cp\u003EMany roboticists, however, have shifted toward interventions that allow individual users to teach their robots how to perform tasks. 
Kareer believes a similar source of passive data can be established to enable practical generalized training that scales the production of humanoid robots.\u003C\/p\u003E\u003Cp\u003EThis is why Kareer collaborated with School of IC Assistant Professor \u003Cstrong\u003EDanfei\u003C\/strong\u003E \u003Cstrong\u003EXu\u003C\/strong\u003E and his \u003Ca href=\u0022https:\/\/rl2.cc.gatech.edu\/\u0022\u003E\u003Cstrong\u003ERobot Learning and Reasoning Lab\u003C\/strong\u003E\u003C\/a\u003E to develop EgoMimic, an algorithmic framework that leverages data from egocentric videos.\u003C\/p\u003E\u003Cp\u003EMeta\u2019s Ego4D dataset inspired Kareer\u2019s project. The benchmark dataset, released in 2022, consists of first-person videos of humans performing daily activities. This open-source dataset is used to train AI models from a first-person human perspective.\u003C\/p\u003E\u003Cp\u003E\u201cWhen I looked at Ego4D, I saw a dataset that\u2019s the same as all the large robot datasets we\u2019re trying to collect, except it\u2019s with humans,\u201d Kareer said. \u201cYou just wear a pair of glasses, and you go do things. It doesn\u2019t need to come from the robot. It should come from something more scalable and passively generated, which is us.\u201d\u003C\/p\u003E\u003Cp\u003EKareer acquired a pair of Meta\u2019s Project Aria research glasses, which contain a rich sensor suite and can record video from a first-person perspective through external RGB and SLAM cameras.\u003C\/p\u003E\u003Cp\u003EKareer recorded himself folding a shirt while wearing the glasses, repeating the process many times. He did the same with other tasks, such as placing a toy in a bowl and packing groceries into a bag. Then, he constructed a humanoid robot with pincers for hands and attached the glasses to the top to mimic a first-person viewpoint.\u003C\/p\u003E\u003Cp\u003EThe robot performed each task repeatedly for two hours. 
Kareer said building a training dataset the traditional way would take days of teleoperating the robot and recording its sensor data. For his project, he needed to gather only a baseline of sensor data to ensure the robot\u2019s performance improved.\u003C\/p\u003E\u003Cp\u003EKareer bridged the gap between the two training sets with the EgoMimic algorithm. The robot\u2019s task performance rating increased by as much as 400% across various tasks with just 90 minutes of recorded footage. It also showed the ability to perform these tasks in unseen environments.\u003C\/p\u003E\u003Cp\u003EIf enough people wear Aria glasses or other smart glasses while performing daily tasks, they could create the passive data bank needed to train robots on a massive scale.\u003C\/p\u003E\u003Cp\u003EThis type of data collection can enable nearly endless possibilities for roboticists to help humans achieve more in their everyday lives. Humanoid robots could be produced and trained at an industrial scale and perform tasks the same way humans do.\u003C\/p\u003E\u003Cp\u003E\u201cThis work is most applicable to jobs that you can get a humanoid robot to do,\u201d Kareer said. \u201cIn whatever industry we are allowed to collect egocentric data, we can develop humanoid robots.\u201d\u003C\/p\u003E\u003Cp\u003EKareer will present his paper on EgoMimic at the 2025 IEEE International Conference on Robotics and Automation (ICRA), which will take place from May 19 to 23 in Atlanta. 
The paper was co-authored by Xu and School of IC Assistant Professor \u003Cstrong\u003EJudy\u003C\/strong\u003E \u003Cstrong\u003EHoffman\u003C\/strong\u003E, fellow Tech students \u003Cstrong\u003EDhruv\u003C\/strong\u003E \u003Cstrong\u003EPatel\u003C\/strong\u003E, \u003Cstrong\u003ERyan\u003C\/strong\u003E \u003Cstrong\u003EPunamiya\u003C\/strong\u003E, \u003Cstrong\u003EPranay\u003C\/strong\u003E \u003Cstrong\u003EMathur\u003C\/strong\u003E, and \u003Cstrong\u003EShuo\u003C\/strong\u003E \u003Cstrong\u003ECheng\u003C\/strong\u003E, and \u003Cstrong\u003EChen\u003C\/strong\u003E \u003Cstrong\u003EWang\u003C\/strong\u003E, a Ph.D. student at Stanford.\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EInspired by a dataset created by Meta, a Georgia Tech Ph.D. student is bringing a new perspective to robotics training.\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Inspired by a dataset created by Meta, a Georgia Tech Ph.D. student is bringing a new perspective to robotics training."}],"uid":"32045","created_gmt":"2025-02-19 15:00:13","changed_gmt":"2025-02-19 20:20:46","author":"Ben Snedeker","boilerplate_text":"","field_publication":"","field_article_url":"","location":"Atlanta, GA","dateline":{"date":"2025-02-19T00:00:00-05:00","iso_date":"2025-02-19T00:00:00-05:00","tz":"America\/New_York"},"extras":[],"hg_media":{"676332":{"id":"676332","type":"image","title":"Georgia Tech Ph.D. student Simar Kareer is revolutionizing how robots are trained.","body":null,"created":"1739977597","gmt_created":"2025-02-19 15:06:37","changed":"1739977597","gmt_changed":"2025-02-19 15:06:37","alt":"Georgia Tech Ph.D. 
student Simar Kareer is revolutionizing how robots are trained.","file":{"fid":"260101","name":"Simar Kareer_86A7668 (1).jpg","image_path":"\/sites\/default\/files\/2025\/02\/19\/Simar%20Kareer_86A7668%20%281%29.jpg","image_full_path":"http:\/\/hg.gatech.edu\/\/sites\/default\/files\/2025\/02\/19\/Simar%20Kareer_86A7668%20%281%29.jpg","mime":"image\/jpeg","size":118241,"path_740":"http:\/\/hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/2025\/02\/19\/Simar%20Kareer_86A7668%20%281%29.jpg?itok=jakxURZ2"}}},"media_ids":["676332"],"related_links":[{"url":"https:\/\/youtu.be\/ckGUsdFX9pU?si=b-J_aUjaDNpMpq2b","title":"Project Aria Case Study: Introducing EgoMimic by the Georgia Institute of Technology"}],"groups":[{"id":"47223","name":"College of Computing"},{"id":"1188","name":"Research Horizons"},{"id":"50876","name":"School of Interactive Computing"}],"categories":[{"id":"135","name":"Research"},{"id":"152","name":"Robotics"}],"keywords":[{"id":"10199","name":"Daily Digest"},{"id":"181991","name":"Georgia Tech News Center"},{"id":"187915","name":"go-researchnews"}],"core_research_areas":[{"id":"39521","name":"Robotics"}],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EBen Snedeker, Communication Manager\u003C\/p\u003E\u003Cp\u003EGeorgia Tech College of Computing\u003C\/p\u003E","format":"limited_html"}],"email":[],"slides":[],"orientation":[],"userdata":""}}}