{"652985":{"#nid":"652985","#data":{"type":"event","title":"PhD Defense by Jiachen Yang","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle:\u0026nbsp;\u003C\/strong\u003ECooperation in Multi-Agent Reinforcement Learning\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EDate:\u0026nbsp;\u003C\/strong\u003ENovember 30th, Tuesday, 2021\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ETime:\u0026nbsp;\u003C\/strong\u003E7:00-9:00 PM Eastern Time (4:00-6:00 PM Pacific Time)\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ELocation\u003C\/strong\u003E:\u0026nbsp;\u003Ca href=\u0022https:\/\/bluejeans.com\/773753749\/7843\u0022\u003Ehttps:\/\/bluejeans.com\/773753749\/7843\u003C\/a\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EJiachen Yang\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMachine Learning PhD Candidate\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESchool of Computational Science and Engineering\u003Cbr \/\u003E\r\nGeorgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ECommittee\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E1. Dr. Hongyuan Zha (Advisor),\u0026nbsp;School of Computational Science and Engineering, Georgia Institute of Technology |\u0026nbsp;Executive Dean of School of Data Science, Chinese University of Hong Kong, Shenzhen\u003C\/p\u003E\r\n\r\n\u003Cp\u003E2. Dr. Tuo Zhao (Co-Advisor), School of Industrial and Systems Engineering, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E3. Dr. Charles Isbell, Dean of College of Computing, School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E4. Dr. Matthew Gombolay, School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E5. Dr. 
Daniel Faissol, Computational Engineering Division, Lawrence Livermore National Laboratory\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EAbstract\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EAs progress in reinforcement learning (RL) gives rise to increasingly general and powerful artificial intelligence, society needs to anticipate a possible future in which multiple RL agents learn and interact in a shared multi-agent environment. When a single principal has oversight of the multi-agent system, how should agents learn to cooperate via centralized training to achieve individual and global objectives? When agents belong to self-interested principals with imperfectly aligned objectives, how can cooperation emerge from fully-decentralized learning? This dissertation addresses both questions by proposing novel methods for multi-agent reinforcement learning (MARL) and demonstrating the empirical effectiveness of these methods in high-dimensional simulated environments.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ETo address the first case, we propose new algorithms for fully-cooperative MARL in the paradigm of centralized training with decentralized execution. Firstly, we propose a method based on multi-agent curriculum learning and multi-agent credit assignment to address the setting where global optimality is defined as the attainment of all individual goals. Secondly, we propose a hierarchical MARL algorithm to discover and learn interpretable and useful skills for a multi-agent team to optimize a single team objective. Extensive experiments with ablations show the strengths of our approaches over state-of-the-art baselines.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ETo address the second case, we propose learning algorithms to attain cooperation within a population of self-interested RL agents. 
We propose the design of a new agent that is equipped with the ability to incentivize other RL agents and to explicitly account for those agents\u0026#39; learning processes. This agent overcomes a key limitation of fully-decentralized training and generates emergent cooperation in difficult social dilemmas. Then, we extend and apply this technique to the problem of incentive design, where a central incentive designer explicitly optimizes a global objective only by intervening on the rewards of a population of independent RL agents. Experiments on the problem of optimal taxation in a simulated market economy demonstrate the effectiveness of this approach.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Cooperation in Multi-Agent Reinforcement Learning"}],"uid":"27707","created_gmt":"2021-11-18 14:17:23","changed_gmt":"2021-11-18 14:17:23","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2021-11-30T19:00:00-05:00","event_time_end":"2021-11-30T21:00:00-05:00","event_time_end_last":"2021-11-30T21:00:00-05:00","gmt_time_start":"2021-12-01 00:00:00","gmt_time_end":"2021-12-01 02:00:00","gmt_time_end_last":"2021-12-01 02:00:00","rrule":null,"timezone":"America\/New_York"},"extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"},{"id":"174045","name":"Graduate students"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}