
PhD Proposal by Jiaao Chen


Title: Efficient and Adaptive Machine Learning for Natural Language Processing

Date/Time: Nov 20, 2023, 12:00 PM to 2:00 PM Eastern Time (US)

Location: Zoom Link

Meeting ID: 915 0053 1039

Passcode: 077482

 

Jiaao Chen 

Ph.D. Candidate in Computer Science

School of Interactive Computing

Georgia Institute of Technology

 

Committee:

Dr. Diyi Yang (advisor), Computer Science Department, Stanford University

Dr. Mark Riedl (co-advisor), School of Interactive Computing, Georgia Tech

Dr. Alan Ritter, School of Interactive Computing, Georgia Tech

Dr. Zsolt Kira, School of Interactive Computing, Georgia Tech

Dr. Colin Raffel, Department of Computer Science, University of Toronto

 

 

Abstract:

In this thesis, we advocate for efficient and adaptive machine learning for NLP, with the goal of making NLP models useful in real-world applications. NLP has recently undergone a transformational shift toward the development and application of Large Language Models (LLMs), which represent a significant leap in performance. However, the extensive computational resources and vast textual datasets these models rely on create key challenges in resource-limited environments, where data, computational power, memory, and specialized expertise are scarce. We argue that developing machine learning methods that can adapt efficiently under such constraints is particularly vital in the era of LLMs, so that the benefits of the technology remain sustainable, accessible, and generalizable.

 

We explore this perspective through three components of the learning process of NLP models: (a) improving data efficiency via data augmentation and semi-supervised learning, which greatly alleviates the dependence on labeled data and adapts NLP models to data-limited scenarios; (b) incorporating structures when learning NLP models, which explicitly exploits rich hidden structures beyond plain text to enhance learning efficiency and make NLP models more generalizable to novel settings; and (c) improving training efficiency via parameter-efficient fine-tuning and continual learning, which allows large language models to be adapted to emerging data and tasks under limited computation and memory budgets. Building on these three dimensions, we further propose an efficient and adaptive framework that adapts LLMs to novel settings with minimal human effort. Throughout this thesis, the ultimate goal is to democratize NLP functionality, making it more accessible and adaptable for languages with scarce resources, specialized fields with unique needs, and nascent applications that conventional NLP methodologies fail to accommodate.
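
As a concrete illustration of component (c), the sketch below shows one common form of parameter-efficient fine-tuning: a LoRA-style low-rank adapter wrapped around a frozen pretrained linear layer. This is a minimal sketch assuming PyTorch; the class name, rank, and scaling values are illustrative assumptions, not the specific method proposed in the thesis.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen pretrained linear layer and add a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        # Low-rank factors: A projects down to `rank`, B projects back up
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen forward pass plus the scaled low-rank correction
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

Wrapping a layer such as LoRALinear(nn.Linear(768, 768)) leaves the roughly 590K pretrained weights frozen and trains only about 12K adapter parameters, which is why this style of adaptation suits memory- and compute-constrained settings.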

 
