{"671416":{"#nid":"671416","#data":{"type":"event","title":"ML@GT Seminar Series | Efficient \u0026 Scalable NLP through Retrieval-Augmented Language Models","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003EFeaturing Scott Yih, Facebook AI Research (FAIR)\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003E\u003Cspan\u003E\u003Cspan\u003E\u003Cspan\u003EAbstract:\u0026nbsp;\u003C\/span\u003E\u003C\/span\u003E\u003C\/span\u003E\u003C\/strong\u003E\u003Cspan\u003E\u003Cspan\u003E\u003Cspan\u003EWhile large-scale language models work incredibly well, it is expensive to train them, difficult to explain their predictions, and nearly impossible to keep them current over time. It is unclear when we can trust their predictions, and none of the current large language models can answer questions about current topics, such as COVID-19, since the corpora used for their training were created several years ago. To develop the next generation of general-purpose language models with smaller, simpler, and much more efficient models, we believe information retrieval is a key component. When interacting with each other and with the world, humans tap into many different forms of knowledge, including world knowledge (e.g., commonsense, updated world facts, trending news) and user knowledge (e.g., conversational memory, social interactions, additional context such as location, etc.). To incorporate this capability in AI applications, information retrieval provides models access to (potentially large) collections of documents that can contain such knowledge. 
Specifically, we envision that the complete system consists of a small, core model that can easily access additional, task-related knowledge via retrieval, and perform comparably to the largest language models available today.\u0026nbsp;\u003C\/span\u003E\u003C\/span\u003E\u003C\/span\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cspan\u003E\u003Cspan\u003E\u003Cspan\u003EIn this\u0026nbsp;talk, I will first give a research overview of retrieval-augmented language models. Then, I will share some of our recent work, including a general framework that improves any language model by adding a retrieval component, as well as how we apply instruction tuning to both the language model and retrieval system to further increase the gain. Finally, I\u0027ll conclude the\u0026nbsp;talk\u0026nbsp;by discussing some of the lessons we learned and the problems we plan to address in the near future.\u003C\/span\u003E\u003C\/span\u003E\u003C\/span\u003E\u003Cspan\u003E\u003Cspan\u003E\u0026nbsp;\u003C\/span\u003E\u003C\/span\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003E\u003Cspan\u003E\u003Cspan\u003E\u003Cspan\u003EBio:\u0026nbsp;\u003C\/span\u003E\u003C\/span\u003E\u003C\/span\u003E\u003C\/strong\u003E\u003Cspan\u003E\u003Cspan\u003E\u003Cspan\u003EScott Wen-tau Yih is a Research Scientist at FAIR, Meta. His research interests include natural language processing, machine learning and information retrieval. Before joining Meta, Yih was a Principal Research Scientist at the Allen Institute for Artificial Intelligence (AI2), working on scientific question answering. Prior to that, Yih spent 12 years at Microsoft Research, working on a variety of projects including email spam filtering, keyword extraction and search \u0026amp; ad relevance. His recent work focuses on continuous representations and neural models for question answering and retrieval; some of his well-known work includes WikiQA, RAG and DPR. 
Yih received the best paper award from CoNLL\u201911 and an outstanding paper award from ACL\u201915, and has served as program co-chair (CEAS\u201909, CoNLL\u201914, EMNLP\u201921) and senior area chair for NLP (ACL, NAACL, EMNLP, EACL) and ML (ICLR, NeurIPS) conferences.\u0026nbsp; \u003C\/span\u003E\u003C\/span\u003E\u003C\/span\u003E\u003C\/p\u003E\r\n","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EThe Machine Learning Center Seminar Series is held bi-weekly on Wednesdays at 12pm.\u0026nbsp;\u003C\/p\u003E\r\n","format":"limited_html"}],"field_summary_sentence":[{"value":"Featuring Scott Yih, Facebook AI Research (FAIR)"}],"uid":"36518","created_gmt":"2023-12-05 15:11:00","changed_gmt":"2024-01-09 17:28:28","author":"shatcher8","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2024-01-17T12:00:00-05:00","event_time_end":"2024-01-17T13:00:00-05:00","event_time_end_last":"2024-01-17T13:00:00-05:00","gmt_time_start":"2024-01-17 17:00:00","gmt_time_end":"2024-01-17 18:00:00","gmt_time_end_last":"2024-01-17 18:00:00","rrule":null,"timezone":"America\/New_York"},"location":"CODA 9th Floor Atrium","extras":["free_food"],"related_links":[{"url":"https:\/\/ml.gatech.edu\/","title":""}],"groups":[{"id":"576481","name":"ML@GT"}],"categories":[],"keywords":[{"id":"173555","name":"Center for Machine Learning"},{"id":"9167","name":"machine learning"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1795","name":"Seminar\/Lecture\/Colloquium"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"177814","name":"Postdoc"},{"id":"174045","name":"Graduate students"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EShelli Hatcher, Program and Operations 
Manager\u003C\/p\u003E\r\n","format":"limited_html"}],"email":[],"slides":[],"orientation":[],"userdata":""}}}