ML@GT Seminar Series | Lessons from Pre-training Llama 3

Featuring Mike Lewis, Facebook AI Research

Abstract: Large language models have revolutionized artificial intelligence, but many details of their creation remain shrouded in mystery because of their cost and commercial value. I will describe the pre-training of Llama 3, a highly competitive open model. Pre-training research is challenging because many decisions must be made accurately on the basis of ablations run at scales orders of magnitude below the final model size. However, this project demonstrates that a state-of-the-art model can be created with a surprisingly simple recipe: carefully optimizing data curation, building efficient infrastructure, and minimizing complexity elsewhere. I will also contrast life on a large pre-training research team with more academic projects, and discuss outstanding research questions in the field.

Bio: Mike Lewis is a research scientist at Meta, currently leading pre-training research for the Llama models. His research interests include pre-training language models (e.g. Llama 3, BART, and RoBERTa), retrieval augmentation (e.g. kNN-LM and RAG), and negotiation dialogue agents (such as the Cicero Diplomacy model). Previously he was a postdoc at the University of Washington (working with Luke Zettlemoyer), and he holds a PhD from the University of Edinburgh (advised by Mark Steedman). He received a Best Paper Award at EMNLP 2016, a Best Resource Paper Award at ACL 2017, and a Best Paper Honourable Mention at ACL 2018. His work has been extensively covered in the media, with varying levels of accuracy.
