Machine Learning Center Seminar Series | SGD and Weight Decay Secretly Compress Your Neural Network

Featuring Dr. Tomer Galanti, Massachusetts Institute of Technology

Abstract: Several empirical results have shown that replacing weight matrices with low-rank approximations results in only a small drop in accuracy, suggesting that the weight matrices at convergence may be close to low-rank matrices.
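For readers unfamiliar with the idea, the following is a minimal sketch of what "replacing a weight matrix with a low-rank approximation" can look like in practice, using a truncated SVD in NumPy. It is an illustration only and is not taken from the talk: the helper names `low_rank_approx` and `numerical_rank`, the matrix sizes, the noise level, and the tolerance are all assumptions chosen for the example.

```python
import numpy as np


def low_rank_approx(W, k):
    """Rank-k approximation of W via truncated SVD.

    By the Eckart-Young theorem this is the best rank-k
    approximation of W in the Frobenius norm.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def numerical_rank(W, tol=1e-3):
    """Count singular values larger than tol times the largest one.

    The relative tolerance is an assumption made for this sketch.
    """
    s = np.linalg.svd(W, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


# Hypothetical stand-in for a trained weight matrix:
# an exactly rank-4 matrix plus a small dense perturbation.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 4))
B = rng.standard_normal((4, 32))
W = A @ B + 1e-4 * rng.standard_normal((64, 32))

W4 = low_rank_approx(W, 4)

# Relative Frobenius error introduced by the rank-4 replacement.
rel_err = np.linalg.norm(W - W4) / np.linalg.norm(W)
print("numerical rank of W:", numerical_rank(W))
print("relative error of rank-4 approximation:", rel_err)
```

When a matrix is close to low-rank, as in this synthetic case, the truncated version stores far fewer numbers (two thin factors instead of the full matrix) while changing the matrix very little, which is the sense in which low rank relates to compression.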

In this talk, we will study the origins of the bias of Stochastic Gradient Descent (SGD) to learn low-rank weight matrices when training Leaky ReLU neural networks. Our results show that training neural networks with SGD and weight decay causes a bias towards rank minimization over the weight matrices. Specifically, we show, both theoretically and empirically, that this bias is more pronounced when using smaller batch sizes, higher learning rates, or increased weight decay. Unlike previous literature, our analysis does not rely on assumptions about the data, convergence, or optimality of the weight matrices and applies to a wide range of neural network architectures of any width or depth. Finally, we will discuss the relationship between our analysis and other related properties, like sparsity, neural collapse, implicit regularization, generalization and compression.

Joint work with Zachary Siegel, Aparna Gupte, and Tomaso Poggio.

Bio: Tomer Galanti is a Postdoctoral Associate in Prof. Poggio's lab at MIT, where he works on theoretical and algorithmic aspects of deep learning. He previously interned as a Research Scientist with DeepMind's Foundations team. He received his Ph.D. from Tel Aviv University and was the university's youngest Ph.D. graduate in 2022. He received the Deutch Annual Prize in Computer Science in 2018 for his Ph.D. He has published numerous papers at top-tier venues, such as NeurIPS, ICLR, ICML, ICCV, ECCV, and JMLR, including an oral presentation paper at NeurIPS 2020.

Summary

The Machine Learning Center Seminar Series is held bi-weekly on Wednesdays at 12:00pm.

Details

Wednesday, Sep 27, 2023, 12:00pm - 1:00pm

Location: CODA Building 9th Floor Atrium

Contact: Shelli Hatcher, Program and Operations Manager

URL: https://coda.gatech.edu/

Extras: Free food

In campus calendar: No

Groups

ML@GT

Status

Workflow Status: Published
Created By: Joshua Preston
Created: 09/21/2023
Modified By: shatcher8
Modified: 09/21/2023

Georgia Institute of Technology