PhD Proposal by Kaige Xie
Title: Lifecycle-Oriented Optimization of Natural Language Generation Systems through Text Sub-Structures
Date: Monday, April 27th, 2026
Time: 1:00–3:00 PM ET
Location: online [Teams link]
Kaige Xie
Ph.D. Student in Computer Science
School of Interactive Computing
Georgia Institute of Technology
Committee:
Dr. Pascal Van Hentenryck (advisor) - School of Industrial and Systems Engineering and School of Interactive Computing, Georgia Institute of Technology
Dr. Thomas Ploetz - School of Interactive Computing, Georgia Institute of Technology
Dr. Chao Zhang - School of Computational Science and Engineering, Georgia Institute of Technology
Abstract:
This dissertation investigates how to optimize natural language generation (NLG) systems built on large language models (LLMs) from a holistic, lifecycle-oriented perspective. While recent advances in LLMs have led to substantial gains across a wide range of NLG tasks, prior research has largely focused on improving benchmark performance, often overlooking the broader challenges that arise across model training, inference, evaluation, and deployment. This dissertation argues that such a performance-centric view is insufficient for real-world NLG systems, whose success depends not only on output quality but also on efficiency, reasoning capability, evaluation fidelity, and user trust. To address this gap, the dissertation introduces a unified framework centered on text sub-structures—semantically meaningful intermediate representations embedded in text—and studies how their recognition and strategic utilization can improve NLG systems throughout their full lifecycle.
The dissertation develops this framework across four representative NLG tasks: dialogue summarization, story generation, action plan generation, and question answering. In dialogue summarization, it shows how dialogue skeletons can facilitate more effective few-shot learning and improve cross-task prompt transfer under limited supervision. In story generation, it demonstrates how outline-based planning structures can guide LLMs toward producing more coherent and engaging narratives. In action plan generation, it examines precondition-effect dependencies as a form of latent world knowledge that enables LLMs to better model action feasibility and environmental change. In question answering, it explores sub-questions as a versatile sub-structure for both fine-grained system evaluation and explanation generation, improving the assessment of open-ended retrieval-augmented generation systems and enhancing users’ ability to judge model reliability. Collectively, these studies show that text sub-structures provide a general and effective semantic scaffold for improving learning efficiency, inference-time planning and reasoning, evaluation robustness, and deployment-time user experience.
Status
- Workflow status: Published
- Created by: Tatianna Richardson
- Created: 04/21/2026
- Modified by: Tatianna Richardson
- Modified: 04/21/2026