<node id="641567">
  <nid>641567</nid>
  <type>event</type>
  <uid>
    <user id="34773"><![CDATA[34773]]></user>
  </uid>
  <created>1606150797</created>
  <changed>1606150909</changed>
  <title><![CDATA[ML Ph.D. Thesis Proposal: Jiachen Yang]]></title>
  <body><![CDATA[<p><strong>Title</strong>:&nbsp;Cooperation in Multi-Agent Reinforcement Learning</p>

<p><strong>Date</strong>: Thursday, December 3rd, 2020</p>

<p><strong>Time</strong>: 1:00 pm - 2:30 pm Eastern time</p>

<p><strong>Location</strong>:&nbsp;<a href="https://bluejeans.com/684552748">https://bluejeans.com/684552748</a></p>

<p>&nbsp;</p>

<h4><strong>Student</strong></h4>

<p><strong>Jiachen Yang</strong></p>

<p>Machine Learning Ph.D. Student</p>

<p>Computational Science and Engineering</p>

<p>Georgia Institute of Technology</p>

<p>&nbsp;</p>

<h4><strong>Committee</strong></h4>

<ul>
	<li>Dr. Hongyuan Zha (advisor) - School of Computational Science and Engineering, Georgia Institute of Technology</li>
	<li>Dr. Tuo Zhao - School of Industrial and Systems Engineering, Georgia Institute of Technology</li>
	<li>Dr. Charles Isbell - School of Interactive Computing, Georgia Institute of Technology</li>
</ul>

<p>&nbsp;</p>

<h4><strong>Abstract</strong></h4>

<p>As progress in deep reinforcement learning (RL) gives rise to increasingly general and powerful artificial intelligence, there is a possible future in which multiple RL agents must learn and interact in a shared multi-agent environment. When a single principal has oversight of the multi-agent system, how should agents learn to cooperate via centralized training to achieve individual and global objectives? Alternatively, when agents belong to many self-interested principals with imperfectly aligned objectives, how can cooperation emerge from fully decentralized learning?</p>

<p>In the first part of the thesis, we propose new algorithms for fully cooperative multi-agent reinforcement learning (MARL) in the paradigm of centralized training with decentralized execution. First, we propose a method based on multi-agent curriculum learning and multi-agent credit assignment to address the setting where global optimality is defined as the attainment of all individual goals. Second, we propose a hierarchical MARL algorithm that learns interpretable and useful skills for a multi-agent team to optimize a single shared reward.</p>
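
<p>As a purely illustrative companion to this paradigm, the PyTorch sketch below shows centralized training with decentralized execution using a counterfactual baseline for credit assignment, in the spirit of COMA (Foerster et al.). The class names, dimensions, and toy data are assumptions for illustration, not the proposed algorithms themselves.</p>

<pre><code>
# A minimal sketch of centralized training with decentralized execution (CTDE)
# with a counterfactual baseline for credit assignment. Illustrative only.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, STATE_DIM, N_ACTIONS = 3, 8, 16, 4

class Actor(nn.Module):
    """Decentralized policy: each agent acts from its local observation only."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 32), nn.ReLU(),
                                 nn.Linear(32, N_ACTIONS))
    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

class CentralCritic(nn.Module):
    """Centralized critic: sees the global state and all agents' actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + N_AGENTS * N_ACTIONS, 64), nn.ReLU(),
            nn.Linear(64, 1))
    def forward(self, state, joint_actions_onehot):
        return self.net(torch.cat([state, joint_actions_onehot], dim=-1))

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()  # trained separately by TD regression (omitted)

# One illustrative policy update on fake data.
state = torch.randn(1, STATE_DIM)
obs = [torch.randn(1, OBS_DIM) for _ in range(N_AGENTS)]
dists = [actor(o) for actor, o in zip(actors, obs)]
actions = [d.sample() for d in dists]
onehot = torch.cat([nn.functional.one_hot(a, N_ACTIONS).float()
                    for a in actions], dim=-1)
q_joint = critic(state, onehot)

losses = []
for i, (d, a) in enumerate(zip(dists, actions)):
    # Counterfactual baseline: marginalize out agent i's action while holding
    # all other agents' actions fixed, so the advantage measures agent i's
    # individual contribution (credit assignment).
    baseline = 0.0
    for a_alt in range(N_ACTIONS):
        alt = onehot.clone()
        alt[0, i * N_ACTIONS:(i + 1) * N_ACTIONS] = \
            nn.functional.one_hot(torch.tensor(a_alt), N_ACTIONS).float()
        baseline = baseline + d.probs[0, a_alt] * critic(state, alt)
    advantage = (q_joint - baseline).detach()
    losses.append(-d.log_prob(a) * advantage)  # per-agent policy gradient

torch.stack(losses).sum().backward()  # gradients reach each decentralized actor
</code></pre>

<p>The counterfactual baseline is what makes the centralized critic useful to each decentralized actor: it isolates each agent&#39;s marginal contribution to the team&#39;s value rather than rewarding all agents equally for a joint outcome.</p>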

<p>In the second part, we propose learning algorithms to attain cooperation within a population of self-interested RL agents. We show that an agent equipped with the ability to incentivize other RL agents, and which explicitly accounts for those agents&#39; learning processes, can overcome the limitations of fully decentralized training and generate emergent cooperation. Building on successful techniques in the completed work, we propose in the remaining work to address two complex applications of MARL: 1) incentive design for <em>in silico</em> experimental economics, where one wishes to optimize a global objective solely by intervening on the rewards of a population of independent RL agents; and 2) adaptive mesh refinement in the finite element method for solving large-scale physical simulations of complex dynamics.</p>
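
<p>The idea of incentivizing other learning agents can be illustrated in miniature: the sketch below differentiates an incentivizer&#39;s objective through one step of a recipient&#39;s own policy-gradient update, so the incentives account for how the recipient learns. The two-action toy game, names, and learning rates are illustrative assumptions, not the thesis&#39;s actual formulation.</p>

<pre><code>
# A minimal PyTorch sketch of learning to incentivize another learning agent.
# The incentivizer differentiates through the recipient's update step
# (create_graph=True), so its incentives shape what the recipient learns.
import torch

theta = torch.zeros(2, requires_grad=True)  # recipient's policy logits
eta = torch.zeros(2, requires_grad=True)    # incentive paid per recipient action
lr_recipient, lr_incentivizer = 1.0, 1.0

def recipient_objective(theta, eta):
    """Recipient maximizes expected environment reward plus incentives."""
    probs = torch.softmax(theta, dim=0)
    env_reward = torch.tensor([0.2, 0.0])  # recipient privately prefers action 0
    return (probs * (env_reward + eta)).sum()

def incentivizer_objective(theta):
    """Incentivizer wants the recipient to choose action 1 (the cooperative act)."""
    return torch.softmax(theta, dim=0)[1]

for step in range(100):
    # Recipient's one-step update, kept differentiable so gradients can
    # flow from the incentivizer's objective back into eta.
    grad_theta = torch.autograd.grad(recipient_objective(theta, eta),
                                     theta, create_graph=True)[0]
    theta_next = theta + lr_recipient * grad_theta

    # Evaluate the incentivizer's objective at the recipient's UPDATED policy;
    # a cost for the incentives paid out could be subtracted here.
    grad_eta = torch.autograd.grad(incentivizer_objective(theta_next), eta)[0]

    with torch.no_grad():
        eta += lr_incentivizer * grad_eta  # incentivizer's ascent step
        theta.copy_(theta_next)            # recipient actually takes its step

print(torch.softmax(theta, dim=0))  # recipient should now favor the cooperative action
</code></pre>

<p>The key design choice is the differentiable one-step lookahead: because the incentivizer&#39;s gradient passes through the recipient&#39;s update, it optimizes what the recipient will learn to do, not merely what the recipient does now.</p>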
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[ML@GT Ph.D. student Jiachen Yang will defend his thesis proposal.]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2020-12-03T13:00:00-05:00]]></value>
      <value2><![CDATA[2020-12-03T14:30:00-05:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Postdoc]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
          <item>
        <value><![CDATA[Undergraduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>576481</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[ML@GT]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
