{"646146":{"#nid":"646146","#data":{"type":"event","title":"PhD Defense by Haoming Jiang","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle\u003C\/strong\u003E:\u0026nbsp;Reducing Human Labor Cost in Deep Learning for Natural Language Processing\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EDate\u003C\/strong\u003E:\u0026nbsp;Thursday, April 15, 2021\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ETime\u003C\/strong\u003E:\u0026nbsp;12:00PM (EST) \/ 9:00AM (PST)\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ELocation\u003C\/strong\u003E: (BlueJeans meeting link)\u0026nbsp;\u003Ca href=\u0022https:\/\/bluejeans.com\/206131916\u0022 id=\u0022LPlnk620690\u0022\u003Ehttps:\/\/bluejeans.com\/206131916\u003C\/a\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EHaoming Jiang\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMachine Learning Ph.D. Student\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESchool of Industrial and Systems Engineering\u003Cbr \/\u003E\r\nGeorgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ECommittee\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Tuo Zhao\u0026nbsp;(Advisor), School of Industrial and Systems Engineering, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Weizhu Chen, Microsoft Dynamics 365 AI, Microsoft\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Yao Xie, School of Industrial and Systems Engineering, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Diyi Yang,\u0026nbsp;School of Interactive Computing, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. 
Chao Zhang, School of Computational Science and Engineering, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EAbstract\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDeep learning has fundamentally changed the landscape of natural language processing (NLP). However, training deep learning models requires huge amounts of manually labeled data, which are prohibitively expensive to obtain in some real-world applications. In addition, accurately evaluating models requires\u0026nbsp;human interaction, which is not affordable for large-scale experiments. This dissertation focuses on reducing such human labor costs in deep learning for NLP. Specifically, we develop novel frameworks for training deep learning models with limited\/noisy annotation and for estimating human evaluation scores:\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cem\u003ETraining with Limited Supervision\u003C\/em\u003E. \u0026nbsp;Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, due to limited data resources from downstream tasks and the extremely high complexity of pre-trained models, aggressive fine-tuning often causes the fine-tuned model to overfit the training data of downstream tasks and fail to generalize well to unseen data. To address this issue, we propose a new learning framework for robust and efficient fine-tuning of pre-trained models via regularized optimization. Our experiments show that the proposed framework achieves new state-of-the-art performance on many NLP tasks.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cem\u003ETraining with Weak Supervision.\u0026nbsp;\u003C\/em\u003EWhen manually labeled data is not available, we can leverage domain expert knowledge to generate weakly labeled data. 
Although weak supervision does not require large amounts of manual annotation, the weak labels it generates from external knowledge bases are highly incomplete and noisy. To address this challenge, we propose a two-stage self-training framework, which leverages the power of pre-trained language models to improve the prediction performance of NLP models. Thorough experiments on benchmark datasets demonstrate the superiority of the proposed framework.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cem\u003EDialogue Evaluation without Human Interaction.\u0026nbsp;\u003C\/em\u003EIn addition to model training, we also address the problem of reliable human-free automatic evaluation for dialogue systems. An ideal environment for evaluating dialogue systems, also known as the Turing Test, requires human interaction, which is usually not affordable for large-scale experiments. To bridge this gap, we propose a new framework named ENIGMA for estimating human evaluation scores based on recent advances in off-policy evaluation in reinforcement learning.\u0026nbsp;Our experiments show that ENIGMA strongly correlates with human evaluation scores.\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Reducing Human Labor Cost in Deep Learning for Natural Language Processing"}],"uid":"27707","created_gmt":"2021-04-05 18:58:18","changed_gmt":"2021-04-05 18:58:18","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2021-04-15T13:00:00-04:00","event_time_end":"2021-04-15T15:00:00-04:00","event_time_end_last":"2021-04-15T15:00:00-04:00","gmt_time_start":"2021-04-15 17:00:00","gmt_time_end":"2021-04-15 19:00:00","gmt_time_end_last":"2021-04-15 19:00:00","rrule":null,"timezone":"America\/New_York"},"extras":[],"groups":[{"id":"221981","name":"Graduate 
Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"78771","name":"Public"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}