{"678807":{"#nid":"678807","#data":{"type":"event","title":"PhD Defense by Chao Jiang ","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle:\u0026nbsp;\u003C\/strong\u003EStudying Text Revision in Scientific Writing\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDate\/Time\u003C\/strong\u003E: December 16th, 2024, 4 PM -- 6 PM EST (1 PM -- 3 PM PST)\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ELocation\u003C\/strong\u003E: Coda C1115 Druid Hills\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EZoom\u003C\/strong\u003E:\u0026nbsp;\u003Ca href=\u0022https:\/\/gatech.zoom.us\/j\/9861666067?pwd=MkpxYWRjUUdJeWxDUHBzUmF5RVI5Zz09\u0026amp;omn=92372876609\u0022 title=\u0022https:\/\/gatech.zoom.us\/j\/9861666067?pwd=MkpxYWRjUUdJeWxDUHBzUmF5RVI5Zz09\u0026amp;omn=92372876609\u0022\u003Ehttps:\/\/gatech.zoom.us\/j\/9861666067?pwd=MkpxYWRjUUdJeWxDUHBzUmF5RVI5Zz09\u0026amp;omn=92372876609\u003C\/a\u003E\u0026nbsp;\u0026nbsp;(Meeting ID: 986 166 6067\u0026nbsp;\u0026nbsp; Passcode: 964018)\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EChao Jiang\u0026nbsp;\u003C\/strong\u003E(\u003Ca href=\u0022https:\/\/chaojiang06.github.io\/\u0022\u003EHomepage\u003C\/a\u003E)\u003C\/p\u003E\u003Cp\u003EPh.D. Candidate in Computer Science\u003C\/p\u003E\u003Cp\u003ESchool of Interactive Computing\u003C\/p\u003E\u003Cp\u003EGeorgia Institute of Technology\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ECommittee:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EDr. Wei Xu (advisor), School of Interactive Computing, Georgia Tech\u003C\/p\u003E\u003Cp\u003EDr. Alan Ritter, School of Interactive Computing, Georgia Tech\u003C\/p\u003E\u003Cp\u003EDr. Kartik Goyal, School of Interactive Computing, Georgia Tech\u003C\/p\u003E\u003Cp\u003EDr. Nanyun Peng, Computer Science Department, UCLA\u003C\/p\u003E\u003Cp\u003EDr. 
Cheng Li, Google DeepMind\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAbstract:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EWriting is essential for sharing scientific discoveries, and researchers devote significant effort to revising their papers to improve writing quality and incorporate new findings. The revision process encodes valuable knowledge, including logical and structural improvements at the document level and stylistic and grammatical refinements at the sentence and word levels. This dissertation presents a complete computational framework for extracting text revisions at different levels of granularity and analyzing edits made for different purposes.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EExtracting human-made revisions requires accurately matching text snippets before and after editing. In this talk, I will first present our contribution to the state-of-the-art methods for monolingual sentence alignment. We propose a neural CRF model that captures sequential dependencies and semantic similarity between sentences in parallel documents. The proposed approach outperforms previous methods by a large margin and enables the creation of high-quality text revision datasets. Next, to study fine-grained editing operations within sentences, we design a novel neural semi-Markov CRF alignment model for monolingual word\/phrase alignment. This model unifies word and phrase alignments using variable-length spans and achieves state-of-the-art performance on both in-domain and out-of-domain evaluations.\u0026nbsp; It also demonstrates utility in downstream tasks, such as automatic text simplification and sentence pair classification.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EWe further present arXivEdits, a dataset containing human-annotated sentence alignments and fine-grained span-level edits across multiple versions of 751 research papers. 
Enabled by this corpus, we perform a detailed analysis of revision strategies in scientific writing, revealing common practices researchers use to improve their papers. Finally, this dissertation explores human revision from a readability perspective through MedReadMe, a new dataset consisting of sentence-level readability ratings and complex span annotations for 4,520 medical sentences. This dataset supports fine-grained readability analysis and the evaluation of state-of-the-art readability metrics. By incorporating novel features, we significantly improve the correlation of these metrics with human judgments.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003E\u003Cstrong\u003EStudying Text Revision in Scientific Writing\u003C\/strong\u003E\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Studying Text Revision in Scientific Writing"}],"uid":"27707","created_gmt":"2024-12-10 20:15:13","changed_gmt":"2024-12-10 20:15:51","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2024-12-16T16:00:00-05:00","event_time_end":"2024-12-16T18:00:00-05:00","event_time_end_last":"2024-12-16T18:00:00-05:00","gmt_time_start":"2024-12-16 21:00:00","gmt_time_end":"2024-12-16 23:00:00","gmt_time_end_last":"2024-12-16 23:00:00","rrule":null,"timezone":"America\/New_York"},"location":"Coda C1115 Druid Hills","extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}