PhD Defense by Yuchen Zhuang

Title: Advancing Reasoning and Planning in Large Language Models via Reward Shaping

Date: July 1st, 2025

Time: 4:30 - 6:00 PM ET

Location: Online

Zoom link: https://gatech.zoom.us/j/99388025469

Yuchen Zhuang

Machine Learning PhD Student

School of Computational Science and Engineering
Georgia Institute of Technology

Committee

1. Dr. Chao Zhang (CSE, Georgia Tech, Advisor)

2. Dr. Bo Dai (CSE, Georgia Tech, Google DeepMind)

3. Dr. Tuo Zhao (ISyE, Georgia Tech)

4. Dr. Steve Mussmann (CS, Georgia Tech)

5. Dr. Sherry Yang (NYU, Google DeepMind)

Abstract

Recent advancements in large language models (LLMs) have significantly enhanced their reasoning and planning capabilities, enabling them to serve effectively in complex, real-world scenarios. Despite these improvements, achieving human-level performance remains challenging, particularly for tasks requiring extensive multi-step reasoning and sophisticated planning. Motivated by these limitations, my dissertation focuses on improving the reasoning and planning abilities of LLMs through reward shaping, which guides LLM decision-making by optimizing rewards for desired outcomes.

The core contributions of this thesis are organized around three key aspects of effective and robust reasoning in LLM agents: (1) Formulating and Evaluating LLM-based Agents for External Tool Use. Effectively leveraging external tools is crucial for extending the practical utility of LLMs. (2) Efficient Action Space Navigation in LLM Agents. The complexity of multi-step planning tasks, involving numerous candidate actions, demands efficient exploration strategies. (3) Lightweight Adaptation for Black-Box LLM Personalization. The practical deployment of LLMs often involves adapting models to specific users without access to internal model parameters. Together, these thrusts represent a cohesive, data-centric strategy for enhancing LLM capabilities, systematically improving their ability to reason, plan, and adapt efficiently in complex, real-world environments.
