PhD Defense by Alex Havrilla

Title: Towards a Theory and Practice of Open-ended Reasoning with Generative Models

Date: 6/13/2025

Time: 1 PM

Location:

- In-person: Skiles 202

- Remote: https://gatech.zoom.us/j/5472104648

Alexander Havrilla

Machine Learning PhD Student

School of Mathematics

Georgia Institute of Technology

Committee

1. Dr. Wenjing Liao, School of Mathematics (Advisor), Georgia Tech

2. Dr. Mark Riedl, School of Interactive Computing, Georgia Tech

3. Dr. Tuo Zhao, School of Industrial and Systems Engineering, Georgia Tech

4. Dr. Jacob Abernethy, School of Interactive Computing, Georgia Tech

5. Dr. David Alvarez-Melis, School of Engineering and Applied Sciences, Harvard

Abstract

Driven by advances in large language models (LLMs), the last several years have seen an explosion in AI reasoning capability. In this dissertation, we characterize two distinct types of reasoning: closed-ended and open-ended. We define closed-ended reasoning as the systematic application of a defined set of rules to reach a desired outcome. In contrast, we describe open-ended reasoning as a less structured process that often requires creating or adapting the rule sets themselves and is characterized by a greater need for exploration and discovery. While LLMs increasingly excel at closed-ended reasoning, they struggle more with problems requiring its open-ended counterpart. We study both types of reasoning in three parts. First, we establish novel approximation and statistical theory for LLMs. This theory identifies data complexity as a driving factor behind scaling laws, which themselves have a strong downstream effect on reasoning ability. Second, to improve reasoning ability in practice, we develop a novel RL framework for LLMs, trlX, which is used to fine-tune LLMs on reasoning problems. Our analysis reveals the exploration ability of LLMs as a key bottleneck to future improvement via RL. Third, motivated by this bottleneck, we propose SPARQ: a self-improvement-style synthetic data generation algorithm that draws on techniques from the quality-diversity (QD) literature to improve both the correctness and diversity of LLM reasoning. We conclude by discussing open problems and future directions for better open-ended AI reasoning.
