PhD Defense by Guan-Horng Liu

Title: Large-Scale Optimization for Deep Neural Network Architecture: A Dynamical System Theory Perspective

Date: Wednesday, June 26th, 2024

Time: 1:00 - 3:00 pm EDT (6:00 - 8:00 pm London time)

Location/Remote link: Coda C0915 Atlantic https://gatech.zoom.us/j/3392051118?omn=97259773696

Guan-Horng Liu

Machine Learning PhD Student

School of Aerospace Engineering

Georgia Institute of Technology

Committee

1. Dr. Evangelos Theodorou (School of Aerospace Engineering, Georgia Tech; Advisor)

2. Dr. Molei Tao (School of Mathematics, Georgia Tech)

3. Dr. Yao Xie (School of Industrial and Systems Engineering, Georgia Tech)

4. Dr. Justin Romberg (School of Electrical and Computer Engineering, Georgia Tech)

5. Dr. Arnaud Doucet (Department of Statistics, University of Oxford; Google DeepMind)

Abstract

Optimization of deep neural networks (DNNs) has been a driving force in the advancement of modern artificial intelligence. Despite efforts to design DNN architectures that leverage domain-specific knowledge, the development of optimization algorithms has often progressed independently of architectural innovations. This thesis investigates large-scale optimization methods that exploit the structure of the deep architectures being optimized. Specifically, we demonstrate that dynamical systems theory and optimal control theory provide a principled foundation for characterizing algorithms that train various deep architectures, including standard DNNs, Neural ODEs, and SDEs such as diffusion models and diffusion bridges.

Optimal control, in its broadest sense, examines the principle of optimization over dynamical systems. This methodological perspective arises naturally in training neural differential equations and can also be applied to standard DNNs, where backpropagation emerges as an approximate form of dynamic programming. Throughout the development, we emphasize the significance of control-theoretic components such as differential programming and the nonlinear Feynman-Kac lemma, which unify existing optimization methods and extend them to a broader class of complex dynamics and problem setups that may otherwise be hard to adapt to or foresee. The developed methods have been applied to large-scale applications such as image generation, restoration, and translation, as well as solving mean-field games and opinion modeling.

Details

Wednesday

Jun 26 2024

01:00pm - 03:00pm

Location: Coda C0915 Atlantic

Groups

Graduate Studies

Status

Workflow Status: Published
Created By: Tatianna Richardson
Created: 06/24/2024
Modified By: Tatianna Richardson
Modified: 06/24/2024