PhD Proposal by Chao Jiang
Title: Studying Text Revision in Scientific Writing
Date/Time: Jan. 4th, 2024, 1:30 PM - 3:30 PM ET (10:30 AM - 12:30 PM PST)
Location: Zoom Link
Chao Jiang
Ph.D. Student in Computer Science
School of Interactive Computing
Georgia Institute of Technology
Committee:
Dr. Wei Xu (advisor), School of Interactive Computing, Georgia Tech
Dr. Alan Ritter, School of Interactive Computing, Georgia Tech
Dr. Kartik Goyal, School of Interactive Computing, Georgia Tech
Dr. Nanyun Peng, Computer Science Department, University of California, Los Angeles
Abstract:
Scientific publications are the primary channel for sharing research findings. Researchers devote considerable effort to improving the quality of their writing, and valuable knowledge is encoded in the revision process. As of December 28, 2023, arXiv (https://arxiv.org/), an open-access e-print service, had archived over 2.3 million papers, more than 600k of which have multiple versions available. This makes arXiv a rich data source for studying text revision in scientific writing. Specifically, revisions between versions of a paper contain valuable information about logical and structural improvements at the document level, as well as stylistic and grammatical refinements at the sentence and word levels. However, the data also poses a unique challenge: its sheer volume demands efficient and effective techniques for extracting and analyzing text revisions. This thesis focuses on (1) developing sentence and word alignment methods to extract revisions at different granularities; (2) constructing a new dataset to analyze fine-grained edits and their underlying intentions; and (3) analyzing human revisions in medical literature from a readability perspective, which is crucial for disseminating scientific knowledge to a broader audience.
Status
- Workflow Status: Published
- Created By: Tatianna Richardson
- Created: 01/04/2024
- Modified By: Tatianna Richardson
- Modified: 01/04/2024