PhD Proposal by Kaige Xie
Title: Lifecycle-Oriented Optimization of Natural Language Generation Systems through Text Sub-Structures
Date: Monday, April 27th, 2026
Time: 1:00–3:00 PM ET
Location: online [Teams link]
Kaige Xie
Ph.D. Student in Computer Science
School of Interactive Computing
Georgia Institute of Technology
Committee:
Dr. Pascal Van Hentenryck (advisor) - School of Industrial and Systems Engineering and School of Interactive Computing, Georgia Institute of Technology
Dr. Thomas Ploetz - School of Interactive Computing, Georgia Institute of Technology
Dr. Chao Zhang - School of Computational Science and Engineering, Georgia Institute of Technology
Abstract:
This dissertation investigates how to optimize natural language generation (NLG) systems built on large language models (LLMs) from a holistic, lifecycle-oriented perspective. While recent advances in LLMs have led to substantial gains across a wide range of NLG tasks, prior research has largely focused on improving benchmark performance, often overlooking the broader challenges that arise across model training, inference, evaluation, and deployment. This dissertation argues that such a performance-centric view is insufficient for real-world NLG systems, whose success depends not only on output quality but also on efficiency, reasoning capability, evaluation fidelity, and user trust. To address this gap, the dissertation introduces a unified framework centered on text sub-structures—semantically meaningful intermediate representations embedded in text—and studies how their recognition and strategic utilization can improve NLG systems throughout their full lifecycle.
The dissertation develops this framework across four representative NLG tasks: dialogue summarization, story generation, action plan generation, and question answering. In dialogue summarization, it shows how dialogue skeletons can facilitate more effective few-shot learning and improve cross-task prompt transfer under limited supervision. In story generation, it demonstrates how outline-based planning structures can guide LLMs toward producing more coherent and engaging narratives. In action plan generation, it examines precondition-effect dependencies as a form of latent world knowledge that enables LLMs to better model action feasibility and environmental change. In question answering, it explores sub-questions as a versatile sub-structure for both fine-grained system evaluation and explanation generation, improving the assessment of open-ended retrieval-augmented generation systems and enhancing users’ ability to judge model reliability. Collectively, these studies show that text sub-structures provide a general and effective semantic scaffold for improving learning efficiency, inference-time planning and reasoning, evaluation robustness, and deployment-time user experience.
Status
- Workflow status: Published
- Created by: Tatianna Richardson
- Created: 04/21/2026
- Modified by: Tatianna Richardson
- Modified: 04/21/2026