{"678710":{"#nid":"678710","#data":{"type":"event","title":"ML@GT Seminar Series | Parameter-Efficient Training of Large Language Models","body":[{"value":"\u003Cp\u003EFeaturing Anna Rumshisky,\u0026nbsp;University of Massachusetts Lowell\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAbstract: \u003C\/strong\u003EOver the past few years, scaling large language models (LLMs) has become the go-to AI solution for solving progressively more complicated tasks. However, as the models scaled towards hundreds of billions of parameters, not just training them from scratch, but even fine-tuning to adapt to specific tasks has become computationally expensive and unmanageable for most practitioners. This has led to a crisis, in which only a few well-funded industry labs have the resources to develop high-quality LLMs. In response to these developments, numerous parameter-efficient techniques for fine-tuning large language models have been developed, revolutionizing the accessibility of fine-tuning LLMs. However, until recently, such methods were not available for pre-training.\u003Cbr\u003E\u003Cbr\u003EIn this talk, I will present our recent work on ReLoRA, the first parameter-efficient method for training models from scratch, which utilizes low-rank updates to train high-rank networks, with demonstrated results for models with up to 1.3 billion parameters. I will also include a brief overview and taxonomy of different parameter-efficient training (PET) methods that have been developed over the past two years. 
I will discuss the advantages and limitations of different approaches to PET with respect to parameter and memory efficiency, as well as training speed and inference throughput.\u003Cbr\u003E\u003Cbr\u003E\u003Cstrong\u003EBio: \u003C\/strong\u003EAnna Rumshisky is an Associate Professor of Computer Science at the University of Massachusetts Lowell, where she leads the Text Machine Lab for NLP.\u0026nbsp; Her primary research area is artificial intelligence and large language models (LLMs), with a focus on efficient large-model training and on model analysis and interpretability. She holds a joint appointment as an Amazon Scholar at Amazon AGIF (Artificial General Intelligence Foundations), where she helps drive the scientific efforts behind large-scale foundational LLM training in an industry setting. She was previously a postdoctoral fellow at MIT CSAIL and received her PhD in Computer Science from Brandeis University.\u0026nbsp; She received the NSF CAREER award in 2017 and the best thematic paper award at NAACL-HLT 2019. 
Her research has been funded by the NSF, NIH, and Army Research Office, among others.\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EThe Machine Learning Center Seminar Series is held bi-weekly on Wednesdays at 12pm.\u0026nbsp;\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Featuring Anna Rumshisky, University of Massachusetts Lowell"}],"uid":"36518","created_gmt":"2024-12-02 14:40:34","changed_gmt":"2025-04-03 17:14:58","author":"shatcher8","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2025-04-16T12:00:00-04:00","event_time_end":"2025-04-16T13:00:00-04:00","event_time_end_last":"2025-04-16T13:00:00-04:00","gmt_time_start":"2025-04-16 16:00:00","gmt_time_end":"2025-04-16 17:00:00","gmt_time_end_last":"2025-04-16 17:00:00","rrule":null,"timezone":"America\/New_York"},"location":"CODA 9th Floor Atrium","extras":["free_food"],"hg_media":{"676746":{"id":"676746","type":"image","title":"2025.0416-ML-Seminar-Announcement-Anna-Rumshisky.jpg","body":null,"created":"1743700451","gmt_created":"2025-04-03 17:14:11","changed":"1743700451","gmt_changed":"2025-04-03 17:14:11","alt":"ML@GT Seminar Series hosts Anna Rumshisky on Wednesday, April 16 at 12pm. 
","file":{"fid":"260574","name":"2025.0416-ML-Seminar-Announcement-Anna-Rumshisky.jpg","image_path":"\/sites\/default\/files\/2025\/04\/03\/2025.0416-ML-Seminar-Announcement-Anna-Rumshisky.jpg","image_full_path":"http:\/\/hg.gatech.edu\/\/sites\/default\/files\/2025\/04\/03\/2025.0416-ML-Seminar-Announcement-Anna-Rumshisky.jpg","mime":"image\/jpeg","size":161614,"path_740":"http:\/\/hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/2025\/04\/03\/2025.0416-ML-Seminar-Announcement-Anna-Rumshisky.jpg?itok=yW0uNXF2"}}},"media_ids":["676746"],"groups":[{"id":"576481","name":"ML@GT"}],"categories":[],"keywords":[{"id":"9167","name":"machine learning"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1795","name":"Seminar\/Lecture\/Colloquium"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"177814","name":"Postdoc"},{"id":"174045","name":"Graduate students"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EShelli Hatcher, Program and Operations Manager\u003C\/p\u003E","format":"limited_html"}],"email":[],"slides":[],"orientation":[],"userdata":""}}}