PhD Defense by Haoming Jiang

Title: Reducing Human Labor Cost in Deep Learning for Natural Language Processing

Date: Thursday, April 15, 2021

Time: 12:00PM (EST) / 9:00AM (PST)

Location: (BlueJeans meeting link) https://bluejeans.com/206131916

Haoming Jiang

Machine Learning Ph.D. Student

School of Industrial and Systems Engineering
Georgia Institute of Technology

Committee

Dr. Tuo Zhao (Advisor), School of Industrial and Systems Engineering, Georgia Tech

Dr. Weizhu Chen, Microsoft Dynamics 365 AI, Microsoft

Dr. Yao Xie, School of Industrial and Systems Engineering, Georgia Tech

Dr. Diyi Yang, School of Interactive Computing, Georgia Tech

Dr. Chao Zhang, School of Computational Science and Engineering, Georgia Tech

Abstract

Deep learning has fundamentally changed the landscape of natural language processing (NLP). However, training deep learning models requires huge amounts of manually labeled data, which is prohibitively expensive to obtain in some real-world applications. In addition, accurately evaluating models requires human interaction, which is not affordable for large-scale experiments. This dissertation focuses on reducing such human labor costs in deep learning for NLP. Specifically, we develop novel frameworks for training deep learning models with limited or noisy annotations and for estimating human evaluation scores:

Training with Limited Supervision. Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, because downstream tasks offer limited data and pre-trained models have extremely high complexity, aggressive fine-tuning often causes the fine-tuned model to overfit the downstream training data and fail to generalize to unseen data. To address this issue, we propose a new learning framework for robust and efficient fine-tuning of pre-trained models via regularized optimization. Our experiments show that the proposed framework achieves new state-of-the-art performance on many NLP tasks.

Training with Weak Supervision. When manually labeled data is not available, we can leverage domain expert knowledge to generate weakly labeled data. Although weak supervision does not require large amounts of manual annotation, the weak labels it derives from external knowledge bases are highly incomplete and noisy. To address this challenge, we propose a two-stage self-training framework, which leverages the power of pre-trained language models to improve the prediction performance of NLP models. Thorough experiments on benchmark datasets demonstrate the superiority of the proposed framework.

Dialogue Evaluation without Human Interaction. In addition to model training, we also address the problem of reliable, human-free automatic evaluation for dialogue systems. An ideal environment for evaluating dialogue systems, also known as the Turing Test, needs to involve human interaction, which is usually not affordable for large-scale experiments. To bridge this gap, we propose a new framework named ENIGMA for estimating human evaluation scores based on recent advances in off-policy evaluation in reinforcement learning. Our experiments show that ENIGMA's estimates correlate strongly with human evaluation scores.

Details

Thursday

Apr 15 2021

01:00pm - 03:00pm

URL: https://bluejeans.com/206131916

Groups

Graduate Studies

Status

Workflow status: Published
Created by: Tatianna Richardson
Created: 04/05/2021
Modified By: Tatianna Richardson
Modified: 04/05/2021
