{"633612":{"#nid":"633612","#data":{"type":"event","title":"PhD Proposal by Himanshu Sahni","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle:\u003C\/strong\u003E Hallucinating agent experience to speed up reinforcement learning\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EHimanshu Sahni\u003C\/p\u003E\r\n\r\n\u003Cp\u003EPh.D. student in Computer Science\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESchool of Interactive Computing\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECollege of Computing\u003C\/p\u003E\r\n\r\n\u003Cp\u003EGeorgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EDate:\u003C\/strong\u003E Tuesday, March 17, 2020\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ETime:\u003C\/strong\u003E\u0026nbsp;12:45pm-2:30 PM EST\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ELocation:\u0026nbsp;\u003C\/strong\u003E\u003Ca href=\u0022https:\/\/bluejeans.com\/536486204\u0022 id=\u0022LPlnk989365\u0022\u003Ehttps:\/\/bluejeans.com\/536486204\u003C\/a\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMeeting ID: 536 486 204\u003C\/p\u003E\r\n\r\n\u003Cp\u003E**Note: this proposal is remote-only due to the institute\u0026#39;s guidelines on COVID-19**\u003C\/p\u003E\r\n\r\n\u003Cp\u003E---\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ECommittee:\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Charles Isbell (Advisor), School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Mark Riedl, School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Judy Hoffman, School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. 
Dhruv Batra, School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E---\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ESummary:\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EReinforcement learning has seen widespread success recently. Yet training RL agents remains prohibitively expensive in terms of the number of environment interactions. The overall aim of this research is to significantly reduce the sample complexity required for training RL agents, so that they are easier to deploy in the real world and can learn quickly from experience. This proposal focuses on learning how to alter the experience collected by the agent during exploration, rather than on the learning algorithm itself. We define hallucinations as realistic alterations to the trajectory of an agent, that is, alterations permitted by the environment state space and dynamics. I will demonstrate that by presenting hallucinated data to off-the-shelf RL algorithms, we can significantly improve their sample efficiency.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EAs contributions, I will outline three ways of altering agent experience to benefit learning. The first uses hallucinations to train a representation of the state of the environment when the agent has a limited field of view. Key components of this system are a short-term memory architecture for such environments and an adversarially trained attention controller. The second contribution is a method to alter visual trajectories in hindsight using learned hallucinations of goal images. Combined with Hindsight Experience Replay, this significantly speeds up reinforcement learning, as shown in two navigation-based domains. 
The third proposed contribution outlines how to hallucinate realistic subgoals using state-based value functions.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe contributions above serve to support the thesis statement: We can alter the distribution of an agent\u0026#39;s future experiences by\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Hallucinating agent experience to speed up reinforcement learning"}],"uid":"27707","created_gmt":"2020-03-16 18:22:38","changed_gmt":"2020-03-16 23:21:38","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2020-03-17T13:30:00-04:00","event_time_end":"2020-03-17T15:30:00-04:00","event_time_end_last":"2020-03-17T15:30:00-04:00","gmt_time_start":"2020-03-17 17:30:00","gmt_time_end":"2020-03-17 19:30:00","gmt_time_end_last":"2020-03-17 19:30:00","rrule":null,"timezone":"America\/New_York"},"extras":[],"related_links":[{"url":"https:\/\/bluejeans.com\/536486204","title":"BlueJeans"}],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"102851","name":"Phd proposal"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"78771","name":"Public"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}