<node id="641567">
  <nid>641567</nid>
  <type>event</type>
  <uid>
    <user id="34773"><![CDATA[34773]]></user>
  </uid>
  <created>1606150797</created>
  <changed>1606150909</changed>
  <title><![CDATA[ML Ph.D. Thesis Proposal: Jiachen Yang]]></title>
  <body><![CDATA[<p><strong>Title</strong>:&nbsp;Cooperation in Multi-Agent Reinforcement Learning</p>

<p><strong>Date</strong>: Thursday, December 3rd, 2020</p>

<p><strong>Time</strong>: 1:00 pm - 2:30 pm Eastern time</p>

<p><strong>Location</strong>:&nbsp;<a href="https://bluejeans.com/684552748">https://bluejeans.com/684552748</a></p>

<p>&nbsp;</p>

<h4><strong>Student</strong></h4>

<p><strong>Jiachen Yang</strong></p>

<p>Machine Learning Ph.D. Student</p>

<p>Computational Science and Engineering</p>

<p>Georgia Institute of Technology</p>

<p>&nbsp;</p>

<h4><strong>Committee</strong></h4>

<ul>
	<li>Dr. Hongyuan Zha (advisor) - School of Computational Science and Engineering, Georgia Institute of Technology</li>
	<li>Dr. Tuo Zhao - School of Industrial and Systems Engineering, Georgia Institute of Technology</li>
	<li>Dr. Charles Isbell - School of Interactive Computing, Georgia Institute of Technology</li>
</ul>

<p>&nbsp;</p>

<h4><strong>Abstract</strong></h4>

<p>As progress in deep reinforcement learning (RL) gives rise to increasingly general and powerful artificial intelligence, there is a possible future in which multiple RL agents must learn and interact in a shared multi-agent environment. When a single principal has oversight of the multi-agent system, how should agents learn to cooperate via centralized training to achieve individual and global objectives? Alternatively, when agents belong to many self-interested principals with imperfectly aligned objectives, how can cooperation emerge from fully decentralized learning?</p>

<p>In the first part of the thesis, we propose new algorithms for fully cooperative multi-agent reinforcement learning (MARL) in the paradigm of centralized training with decentralized execution. First, we propose a method based on multi-agent curriculum learning and multi-agent credit assignment to address the setting where global optimality is defined as the attainment of all individual goals. Second, we propose a hierarchical MARL algorithm that learns interpretable and useful skills for a multi-agent team to optimize a single shared reward.</p>
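
<p>As a purely illustrative companion to this paradigm, the PyTorch sketch below shows centralized training with decentralized execution using a counterfactual baseline for credit assignment, in the spirit of COMA (Foerster et al.). The class names, dimensions, and toy data are assumptions for illustration, not the proposed algorithms themselves.</p>

<pre><code>
# A minimal sketch of centralized training with decentralized execution (CTDE)
# with a counterfactual baseline for credit assignment. Illustrative only.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, STATE_DIM, N_ACTIONS = 3, 8, 16, 4

class Actor(nn.Module):
    """Decentralized policy: each agent acts from its local observation only."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 32), nn.ReLU(),
                                 nn.Linear(32, N_ACTIONS))
    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

class CentralCritic(nn.Module):
    """Centralized critic: sees the global state and all agents' actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + N_AGENTS * N_ACTIONS, 64), nn.ReLU(),
            nn.Linear(64, 1))
    def forward(self, state, joint_actions_onehot):
        return self.net(torch.cat([state, joint_actions_onehot], dim=-1))

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()  # trained separately by TD regression (omitted)

# One illustrative policy update on fake data.
state = torch.randn(1, STATE_DIM)
obs = [torch.randn(1, OBS_DIM) for _ in range(N_AGENTS)]
dists = [actor(o) for actor, o in zip(actors, obs)]
actions = [d.sample() for d in dists]
onehot = torch.cat([nn.functional.one_hot(a, N_ACTIONS).float()
                    for a in actions], dim=-1)
q_joint = critic(state, onehot)

losses = []
for i, (d, a) in enumerate(zip(dists, actions)):
    # Counterfactual baseline: marginalize out agent i's action while holding
    # all other agents' actions fixed, so the advantage measures agent i's
    # individual contribution (credit assignment).
    baseline = 0.0
    for a_alt in range(N_ACTIONS):
        alt = onehot.clone()
        alt[0, i * N_ACTIONS:(i + 1) * N_ACTIONS] = \
            nn.functional.one_hot(torch.tensor(a_alt), N_ACTIONS).float()
        baseline = baseline + d.probs[0, a_alt] * critic(state, alt)
    advantage = (q_joint - baseline).detach()
    losses.append(-d.log_prob(a) * advantage)  # per-agent policy gradient

torch.stack(losses).sum().backward()  # gradients reach each decentralized actor
</code></pre>

<p>The counterfactual baseline is what makes the centralized critic useful to each decentralized actor: it isolates each agent&#39;s marginal contribution to the team&#39;s value rather than rewarding all agents equally for a joint outcome.</p>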

<p>In the second part, we propose learning algorithms to attain cooperation within a population of self-interested RL agents. We show that an agent equipped with the ability to incentivize other RL agents, and which explicitly accounts for those agents&#39; learning processes, can overcome the limitations of fully decentralized training and generate emergent cooperation. Building on successful techniques in the completed work, we propose in the remaining work to address two complex applications of MARL: 1) incentive design for <em>in silico</em> experimental economics, where one wishes to optimize a global objective solely by intervening on the rewards of a population of independent RL agents; and 2) adaptive mesh refinement in the finite element method for solving large-scale physical simulations of complex dynamics.</p>
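
<p>The idea of incentivizing other learning agents can be illustrated in miniature: the sketch below differentiates an incentivizer&#39;s objective through one step of a recipient&#39;s own policy-gradient update, so the incentives account for how the recipient learns. The two-action toy game, names, and learning rates are illustrative assumptions, not the thesis&#39;s actual formulation.</p>

<pre><code>
# A minimal PyTorch sketch of learning to incentivize another learning agent.
# The incentivizer differentiates through the recipient's update step
# (create_graph=True), so its incentives shape what the recipient learns.
import torch

theta = torch.zeros(2, requires_grad=True)  # recipient's policy logits
eta = torch.zeros(2, requires_grad=True)    # incentive paid per recipient action
lr_recipient, lr_incentivizer = 1.0, 1.0

def recipient_objective(theta, eta):
    """Recipient maximizes expected environment reward plus incentives."""
    probs = torch.softmax(theta, dim=0)
    env_reward = torch.tensor([0.2, 0.0])  # recipient privately prefers action 0
    return (probs * (env_reward + eta)).sum()

def incentivizer_objective(theta):
    """Incentivizer wants the recipient to choose action 1 (the cooperative act)."""
    return torch.softmax(theta, dim=0)[1]

for step in range(100):
    # Recipient's one-step update, kept differentiable so gradients can
    # flow from the incentivizer's objective back into eta.
    grad_theta = torch.autograd.grad(recipient_objective(theta, eta),
                                     theta, create_graph=True)[0]
    theta_next = theta + lr_recipient * grad_theta

    # Evaluate the incentivizer's objective at the recipient's UPDATED policy;
    # a cost for the incentives paid out could be subtracted here.
    grad_eta = torch.autograd.grad(incentivizer_objective(theta_next), eta)[0]

    with torch.no_grad():
        eta += lr_incentivizer * grad_eta  # incentivizer's ascent step
        theta.copy_(theta_next)            # recipient actually takes its step

print(torch.softmax(theta, dim=0))  # recipient should now favor the cooperative action
</code></pre>

<p>The key design choice is the differentiable one-step lookahead: because the incentivizer&#39;s gradient passes through the recipient&#39;s update, it optimizes what the recipient will learn to do, not merely what the recipient does now.</p>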
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[ML@GT Ph.D. student Jiachen Yang will defend his thesis proposal.]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2020-12-03T13:00:00-05:00]]></value>
      <value2><![CDATA[2020-12-03T14:30:00-05:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Postdoc]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
          <item>
        <value><![CDATA[Undergraduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>576481</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[ML@GT]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
