{"678747":{"#nid":"678747","#data":{"type":"news","title":"New Dataset Takes Aim at Subjective Misinformation in Earnings Calls and Other Public Hearings","body":[{"value":"\u003Cp\u003EGeorgia Tech researchers have created a dataset that trains computer models to understand nuances in human speech during financial earnings calls. The dataset provides a new resource to study how public correspondence affects businesses and markets.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003ESubjECTive-QA is the first human-curated dataset on question-answer pairs from earnings call transcripts (ECTs). The dataset teaches models to identify subjective features in ECTs, like clarity and cautiousness. \u0026nbsp;\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EThe dataset lays the foundation for a new approach to identifying disinformation and misinformation caused by nuances in speech. While ECT responses can be technically true, unclear or irrelevant information can misinform stakeholders and affect their decision-making.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003ETests on White House press briefings showed that the dataset applies to other sectors with frequent question-and-answer encounters, notably politics, journalism, and sports. This increases the odds of effectively informing audiences and improving transparency across public spheres.\u0026nbsp; \u0026nbsp;\u003C\/p\u003E\u003Cp\u003EThe intersecting work between natural language processing and finance earned\u0026nbsp;\u003Ca href=\u0022https:\/\/arxiv.org\/pdf\/2410.20651\u0022\u003E\u003Cstrong\u003Ethe paper\u003C\/strong\u003E\u003C\/a\u003E acceptance to\u0026nbsp;\u003Ca href=\u0022https:\/\/neurips.cc\/\u0022\u003E\u003Cstrong\u003ENeurIPS 2024\u003C\/strong\u003E\u003C\/a\u003E, the 38th Annual Conference on Neural Information Processing Systems. 
NeurIPS is one of the world\u2019s most prestigious conferences on artificial intelligence (AI) and machine learning (ML) research.\u003C\/p\u003E\u003Cp\u003E\u201cSubjECTive-QA has the potential to revolutionize nowcasting predictions with enhanced clarity and relevance,\u201d said\u0026nbsp;\u003Ca href=\u0022https:\/\/shahagam4.github.io\/\u0022\u003E\u003Cstrong\u003EAgam Shah\u003C\/strong\u003E\u003C\/a\u003E, the project\u2019s lead researcher.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u201cIts nuanced analysis of qualities in executive responses, like optimism and cautiousness, deepens our understanding of economic forecasts and financial transparency.\u201d\u003C\/p\u003E\u003Cp\u003E[\u003Ca href=\u0022https:\/\/sites.gatech.edu\/research\/neurips-2024\/\u0022\u003E\u003Cstrong\u003EMICROSITE: Georgia Tech at NeurIPS 2024\u003C\/strong\u003E\u003C\/a\u003E]\u003C\/p\u003E\u003Cp\u003ESubjECTive-QA offers a new means to evaluate financial discourse by characterizing language\u2019s subjective and multifaceted nature. This improves on traditional datasets that quantify sentiment or verify claims from financial statements.\u003C\/p\u003E\u003Cp\u003EThe dataset consists of 2,747 Q\u0026amp;A pairs taken from 120 ECTs from companies listed on the New York Stock Exchange from 2007 to 2021. 
The Georgia Tech researchers annotated each response by hand based on six features for a total of 49,446 annotations.\u003C\/p\u003E\u003Cp\u003EThe group evaluated answers on:\u003C\/p\u003E\u003Cul\u003E\u003Cli\u003ERelevance: the speaker answered the question with appropriate details.\u003C\/li\u003E\u003Cli\u003EClarity: the speaker was transparent in the answer and the message conveyed.\u003C\/li\u003E\u003Cli\u003EOptimism: the speaker answered with a positive outlook regarding future outcomes.\u003C\/li\u003E\u003Cli\u003ESpecificity: the speaker included sufficient and technical details in their answer.\u003C\/li\u003E\u003Cli\u003ECautiousness: the speaker answered using a conservative, risk-averse approach.\u003C\/li\u003E\u003Cli\u003EAssertiveness: the speaker answered with certainty about the company\u2019s events and outcomes.\u003C\/li\u003E\u003C\/ul\u003E\u003Cp\u003EThe Georgia Tech group validated their dataset by training eight computer models to detect and score these six features. The test models comprised three BERT-based pre-trained language models (PLMs) and five popular large language models (LLMs), including Llama and ChatGPT.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EAll eight models scored the highest on the relevance and clarity features. This is attributed to domain-specific pretraining that enables the models to identify pertinent and understandable material.\u003C\/p\u003E\u003Cp\u003EThe PLMs achieved higher scores on the clarity, optimism, specificity, and cautiousness features. The LLMs scored higher on assertiveness and relevance.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EIn another experiment to test transferability, a PLM trained with SubjECTive-QA evaluated 65 Q\u0026amp;A pairs from White House press briefings and gaggles. 
Scores across all six features indicated that models trained on the dataset could succeed in fields outside finance.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u201cBuilding on these promising results, the next step for SubjECTive-QA is to enhance customer service technologies, like chatbots,\u201d said Shah, a Ph.D. candidate studying machine learning.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u201cWe want to make these platforms more responsive and accurate by integrating our analysis techniques from SubjECTive-QA.\u201d\u003C\/p\u003E\u003Cp\u003ESubjECTive-QA is the culmination of two semesters of work through Georgia Tech\u2019s Vertically Integrated Projects (VIP) Program. The\u0026nbsp;\u003Ca href=\u0022https:\/\/vip.gatech.edu\/\u0022\u003E\u003Cstrong\u003EVIP Program\u003C\/strong\u003E\u003C\/a\u003E is an approach to higher education where undergraduate and graduate students work together on long-term project teams led by faculty.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EUndergraduate students earn academic credit and receive hands-on experience through VIP projects. 
The extra help advances ongoing research and gives graduate students mentorship experience.\u003C\/p\u003E\u003Cp\u003EComputer science major\u0026nbsp;\u003Ca href=\u0022http:\/\/pardawalahuzaifa.me\/\u0022\u003E\u003Cstrong\u003EHuzaifa Pardawala\u003C\/strong\u003E\u003C\/a\u003E and mathematics major\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/siddhantsukhani\/\u0022\u003E\u003Cstrong\u003ESiddhant Sukhani\u003C\/strong\u003E\u003C\/a\u003E co-led the SubjECTive-QA project with Shah.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EFellow collaborators included\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/veerkejriwal\/\u0022\u003E\u003Cstrong\u003EVeer Kejriwal\u003C\/strong\u003E\u003C\/a\u003E,\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/abhipi\/\u0022\u003E\u003Cstrong\u003EAbhishek Pillai\u003C\/strong\u003E\u003C\/a\u003E,\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/rohan-bhasin-356aa41a0\/?originalSubdomain=in\u0022\u003E\u003Cstrong\u003ERohan Bhasin\u003C\/strong\u003E\u003C\/a\u003E,\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/andrew-dibiasio-96164721a\/\u0022\u003E\u003Cstrong\u003EAndrew DiBiasio\u003C\/strong\u003E\u003C\/a\u003E,\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/tarun-mandapati-a90443206\/\u0022\u003E\u003Cstrong\u003ETarun Mandapati\u003C\/strong\u003E\u003C\/a\u003E, and\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/dhruv-adha-ba5142215\/\u0022\u003E\u003Cstrong\u003EDhruv Adha\u003C\/strong\u003E\u003C\/a\u003E. All six researchers are undergraduate students studying computer science.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Ca href=\u0022https:\/\/www.scheller.gatech.edu\/directory\/faculty\/chava\/index.html\u0022\u003E\u003Cstrong\u003ESudheer Chava\u003C\/strong\u003E\u003C\/a\u003E co-advises Shah and is the faculty lead of SubjECTive-QA. Chava is a professor in the Scheller College of Business and director of the M.S. 
in Quantitative and Computational Finance (QCF) program.\u003C\/p\u003E\u003Cp\u003EChava is also an adjunct faculty member in the College of Computing\u2019s \u003Ca href=\u0022https:\/\/cse.gatech.edu\/\u0022\u003E\u003Cstrong\u003ESchool of Computational Science and Engineering (CSE)\u003C\/strong\u003E\u003C\/a\u003E.\u003C\/p\u003E\u003Cp\u003E\u201cLeading undergraduate students through the VIP Program taught me the powerful impact of balancing freedom with guidance,\u201d Shah said.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u201cAllowing students to take the helm not only fosters their leadership skills but also enhances my own approach to mentoring, thus creating a mutually enriching educational experience.\u201d\u003C\/p\u003E\u003Cp\u003EPresenting SubjECTive-QA at NeurIPS 2024 opens the dataset to further use and refinement. NeurIPS is one of three primary international conferences on high-impact research in AI and ML. The conference runs Dec. 10-15.\u003C\/p\u003E\u003Cp\u003EThe SubjECTive-QA team is among the 162 Georgia Tech researchers presenting more than 80 papers at NeurIPS 2024. The Georgia Tech contingent includes 46 faculty members, including Chava. 
These faculty represent Georgia Tech\u2019s Colleges of Business, Computing, Engineering, and Sciences, underscoring the pertinence of AI research across domains.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u201cPresenting SubjECTive-QA at prestigious venues like NeurIPS propels our research into the spotlight, drawing the attention of key players in finance and tech,\u201d Shah said.\u003C\/p\u003E\u003Cp\u003E\u201cThe feedback we receive from this community of experts validates our approach and opens new avenues for future innovation, setting the stage for transformative applications in industry and academia.\u201d\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EGeorgia Tech researchers have created a dataset that trains computer models to understand nuances in human speech during financial earnings calls. The dataset provides a new resource to study how public correspondence affects businesses and markets.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003ESubjECTive-QA is the first human-curated dataset on question-answer pairs from earnings call transcripts (ECTs). The dataset teaches models to identify subjective features in ECTs, like clarity and cautiousness. \u0026nbsp;\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EThe dataset lays the foundation for a new approach to identifying disinformation and misinformation caused by nuances in speech. While ECT responses can be technically true, unclear or irrelevant information can misinform stakeholders and affect their decision-making.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003ETests on White House press briefings showed that the dataset applies to other sectors with frequent question-and-answer encounters, notably politics, journalism, and sports. 
This increases the odds of effectively informing audiences and improving transparency across public spheres.\u0026nbsp; \u0026nbsp;\u003C\/p\u003E\u003Cp\u003EThe intersecting work between natural language processing and finance earned\u0026nbsp;\u003Ca href=\u0022https:\/\/arxiv.org\/pdf\/2410.20651\u0022\u003E\u003Cstrong\u003Ethe paper\u003C\/strong\u003E\u003C\/a\u003E acceptance to\u0026nbsp;\u003Ca href=\u0022https:\/\/neurips.cc\/\u0022\u003E\u003Cstrong\u003ENeurIPS 2024\u003C\/strong\u003E\u003C\/a\u003E, the 38th Annual Conference on Neural Information Processing Systems. NeurIPS is one of the world\u2019s most prestigious conferences on artificial intelligence (AI) and machine learning (ML) research.\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"SubjECTive-QA is the first human-curated dataset on question-answer pairs from earnings call transcripts (ECTs). The dataset teaches models to identify subjective features in ECTs, like clarity and cautiousness.  
"}],"uid":"36319","created_gmt":"2024-12-04 12:35:53","changed_gmt":"2024-12-04 21:24:01","author":"Bryant Wine","boilerplate_text":"","field_publication":"","field_article_url":"","location":"Atlanta, GA","dateline":{"date":"2024-12-03T00:00:00-05:00","iso_date":"2024-12-03T00:00:00-05:00","tz":"America\/New_York"},"extras":[],"hg_media":{"675766":{"id":"675766","type":"image","title":"SubjECTive Head Photo.jpg","body":null,"created":"1733315763","gmt_created":"2024-12-04 12:36:03","changed":"1733315763","gmt_changed":"2024-12-04 12:36:03","alt":"CSE NeurIPS 2024","file":{"fid":"259430","name":"SubjECTive Head Photo.jpg","image_path":"\/sites\/default\/files\/2024\/12\/04\/SubjECTive%20Head%20Photo.jpg","image_full_path":"http:\/\/hg.gatech.edu\/\/sites\/default\/files\/2024\/12\/04\/SubjECTive%20Head%20Photo.jpg","mime":"image\/jpeg","size":136969,"path_740":"http:\/\/hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/2024\/12\/04\/SubjECTive%20Head%20Photo.jpg?itok=w8UTZ_0k"}},"675767":{"id":"675767","type":"image","title":"SubjECTive Group.jpg","body":null,"created":"1733315790","gmt_created":"2024-12-04 12:36:30","changed":"1733315790","gmt_changed":"2024-12-04 12:36:30","alt":"CSE NeurIPS 2024","file":{"fid":"259431","name":"SubjECTive Group.jpg","image_path":"\/sites\/default\/files\/2024\/12\/04\/SubjECTive%20Group.jpg","image_full_path":"http:\/\/hg.gatech.edu\/\/sites\/default\/files\/2024\/12\/04\/SubjECTive%20Group.jpg","mime":"image\/jpeg","size":78610,"path_740":"http:\/\/hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/2024\/12\/04\/SubjECTive%20Group.jpg?itok=fOO_WR5k"}}},"media_ids":["675766","675767"],"related_links":[{"url":"https:\/\/www.cc.gatech.edu\/news\/new-dataset-takes-aim-subjective-misinformation-earnings-calls-and-other-public-hearings","title":"New Dataset Takes Aim at Subjective Misinformation in Earnings Calls and Other Public Hearings"}],"groups":[{"id":"47223","name":"College of 
Computing"},{"id":"1188","name":"Research Horizons"},{"id":"50877","name":"School of Computational Science and Engineering"}],"categories":[{"id":"139","name":"Business"},{"id":"131","name":"Economic Development and Policy"},{"id":"135","name":"Research"},{"id":"134","name":"Student and Faculty"},{"id":"8862","name":"Student Research"}],"keywords":[{"id":"10199","name":"Daily Digest"},{"id":"9153","name":"Research Horizons"},{"id":"187915","name":"go-researchnews"},{"id":"192863","name":"go-ai"},{"id":"167089","name":"Scheller College of Business"},{"id":"654","name":"College of Computing"},{"id":"166983","name":"School of Computational Science and Engineering"},{"id":"2556","name":"artificial intelligence"},{"id":"9167","name":"machine learning"},{"id":"191912","name":"Data Science at GT"},{"id":"5993","name":"quantitative and computational finance"},{"id":"190615","name":"Vertically Integrated Projects (VIP) Program"}],"core_research_areas":[{"id":"193655","name":"Artificial Intelligence at Georgia Tech"},{"id":"39431","name":"Data Engineering and Science"}],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EBryant Wine, Communications Officer\u003Cbr\u003E\u003Ca href=\u0022mailto:bryant.wine@cc.gatech.edu\u0022\u003Ebryant.wine@cc.gatech.edu\u003C\/a\u003E\u003C\/p\u003E","format":"limited_html"}],"email":[],"slides":[],"orientation":[],"userdata":""}}}