<node id="682874">
  <nid>682874</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1750785193</created>
  <changed>1750785193</changed>
  <title><![CDATA[PhD Defense by Yuchen Zhuang]]></title>
  <body><![CDATA[<p><strong>Title: Advancing Reasoning and Planning in Large Language Models via Reward Shaping</strong></p><p>&nbsp;</p><p><strong>Date:&nbsp;</strong>July 1st, 2025</p><p><strong>Time:&nbsp;</strong>4:30&nbsp;- 6:00 PM EDT</p><p>Location: Online</p><p>Zoom link:&nbsp;<a href="https://gatech.zoom.us/j/99388025469" title="https://gatech.zoom.us/j/99388025469">https://gatech.zoom.us/j/99388025469</a></p><p>&nbsp;</p><p><strong>Yuchen Zhuang</strong></p><p>Machine Learning PhD Student</p><p>School of Computational Science and Engineering<br>Georgia Institute of Technology</p><p>&nbsp;</p><p><strong>Committee</strong></p><p>1 Dr. Chao Zhang (CSE, Georgia Tech, Advisor)</p><p>2 Dr. Bo Dai (CSE, Georgia Tech, Google DeepMind)</p><p>3 Dr. Tuo Zhao (ISyE, Georgia Tech)</p><p>4 Dr. Steve Mussmann (CS, Georgia Tech)</p><p>5 Dr. Sherry Yang (NYU, Google DeepMind)</p><p>&nbsp;</p><p><strong>Abstract</strong></p><p>Recent advancements in large language models (LLMs) have significantly enhanced their reasoning and planning capabilities, enabling them to serve effectively in complex, real-world scenarios. Despite these improvements, achieving human-level performance remains challenging, particularly for tasks requiring extensive multi-step reasoning and sophisticated planning. Motivated by these limitations, my dissertation focuses on improving the reasoning and planning abilities of LLMs through reward shaping, which guides LLM decision-making by optimizing rewards for desired outcomes.</p><p>&nbsp;</p><p>The core contributions of this thesis are organized around three key aspects of effective and robust reasoning in LLM agents: (1) Formulating and Evaluating LLM-based Agents for External Tool Use. Effectively leveraging external tools is crucial for extending the practical utility of LLMs. (2) Efficient Action Space Navigation in LLM Agents. 
The complexity of multi-step planning tasks, involving numerous candidate actions, demands efficient exploration strategies. (3) Lightweight Adaptation for Black-Box LLM Personalization. The practical deployment of LLMs often involves adapting models to specific users without access to internal model parameters. Together, these thrusts represent a cohesive, data-centric strategy for enhancing LLM capabilities, systematically improving their ability to reason, plan, and adapt efficiently in complex, real-world environments.</p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Advancing Reasoning and Planning in Large Language Models via Reward Shaping]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p><strong>Advancing Reasoning and Planning in Large Language Models via Reward Shaping</strong></p>]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2025-07-01T16:30:00-04:00]]></value>
      <value2><![CDATA[2025-07-01T18:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[Zoom link]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[Phd Defense]]></value>
      </item>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
