PhD Defense | Advancing Reasoning and Planning in Large Language Models via Reward Shaping

Title: Advancing Reasoning and Planning in Large Language Models via Reward Shaping

 

Date: July 1st, 2025

Time: 4:30 - 6:00 PM EDT

Location: Online

Zoom link: https://gatech.zoom.us/j/99388025469

 

Yuchen Zhuang

Machine Learning PhD Student

School of Computational Science and Engineering
Georgia Institute of Technology

 

Committee

1. Dr. Chao Zhang (CSE, Georgia Tech, Advisor)

2. Dr. Bo Dai (CSE, Georgia Tech, Google DeepMind)

3. Dr. Tuo Zhao (ISyE, Georgia Tech)

4. Dr. Steve Mussmann (CS, Georgia Tech)

5. Dr. Sherry Yang (NYU, Google DeepMind)

 

Abstract

Recent advances in large language models (LLMs) have significantly enhanced their reasoning and planning capabilities, enabling them to serve effectively in complex, real-world scenarios. Despite these improvements, achieving human-level performance remains challenging, particularly for tasks requiring extensive multi-step reasoning and sophisticated planning. Motivated by these limitations, my dissertation focuses on improving the reasoning and planning abilities of LLMs through reward shaping, which guides LLM decision-making by optimizing rewards for desired outcomes.

 

The core contributions of this thesis are organized around three key aspects of effective and robust reasoning in LLM agents: (1) Formulating and Evaluating LLM-based Agents for External Tool Use. Effectively leveraging external tools is crucial for extending the practical utility of LLMs. (2) Efficient Action Space Navigation in LLM Agents. The complexity of multi-step planning tasks, involving numerous candidate actions, demands efficient exploration strategies. (3) Lightweight Adaptation for Black-Box LLM Personalization. The practical deployment of LLMs often involves adapting models to specific users without access to internal model parameters. Together, these thrusts represent a cohesive, data-centric strategy for enhancing LLM capabilities, systematically improving their ability to reason, plan, and adapt efficiently in complex, real-world environments. 

Status

  • Workflow Status: Published
  • Created By: shatcher8
  • Created: 06/24/2025
  • Modified By: shatcher8
  • Modified: 06/24/2025
