PhD Defense by Tian-Yi Zhou
Title: Statistical Learning Theory of Deep Neural Networks: A Generalization Viewpoint
Date: April 4th (Friday)
Time: 1 p.m. ET
Location: Groseclose 303 (or on Zoom: https://gatech.zoom.us/j/7710398345)
Tian-Yi Zhou
Machine Learning PhD Student
H. Milton Stewart School of Industrial and Systems Engineering (ISyE)
Georgia Institute of Technology
Committee
1. Dr. Xiaoming Huo (Advisor, ISyE)
2. Dr. Vladimir Koltchinskii (School of Math)
3. Dr. Wenjing Liao (School of Math)
4. Dr. Guang Cheng (UCLA, Department of Statistics & Data Science)
5. Dr. Sung Ha Kang (School of Math)
Abstract
This thesis investigates the mathematical foundations of deep learning, focusing on the statistical guarantees of deep neural networks in regression, classification, and anomaly detection. Specifically, it seeks to understand when and how neural networks effectively generalize to unseen data in these tasks. By uncovering the statistical and computational mechanisms that drive the success of deep learning, this thesis aims to further improve the robustness and accuracy of systems that rely on these technologies.
Chapter 2 of the thesis studies the phenomenon of benign overfitting in convolutional neural networks (CNNs), an important class of neural networks designed to efficiently learn spatial hierarchies of features. It demonstrates that the generalization rate of a CNN architecture remains unchanged even with a substantial increase in model size and parameter count. Chapter 3 studies the classification of unbounded data generated from Gaussian mixture models using fully connected neural networks. We obtain, for the first time, non-asymptotic upper bounds and convergence rates for the excess risk without restrictions on model parameters. Chapter 4 develops a mathematical framework and theory-grounded tools for unsupervised anomaly detection, with a focus on its practice in cybersecurity. It establishes the first optimality result for anomaly detection and quantifies the amount of synthetic anomalies needed to achieve high accuracy. Finally, Chapter 5 explores the use of deep learning in functional data analysis, focusing on the approximation of nonlinear functionals mapping from a reproducing kernel Hilbert space to the real line ℝ.
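As a toy illustration of the Chapter 3 setting (not taken from the thesis itself), the sketch below simulates a two-class Gaussian mixture and evaluates the Bayes-optimal linear rule, whose excess risk is the benchmark that the thesis's neural-network bounds are measured against. The specific means, dimension, and sample size here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-class Gaussian mixture: label y in {-1, +1} with equal priors,
# and x | y ~ N(y * mu, I_d) for a fixed mean vector mu.
d, n = 5, 20000
mu = np.ones(d) / np.sqrt(d)  # unit-norm class-mean separation (assumed)
y = rng.choice([-1, 1], size=n)
x = y[:, None] * mu + rng.standard_normal((n, d))

# With equal priors and a shared identity covariance, the Bayes-optimal
# classifier is the linear rule sign(<mu, x>).
pred = np.sign(x @ mu)
accuracy = np.mean(pred == y)
print(f"empirical accuracy of the Bayes rule: {accuracy:.3f}")
```

For this configuration the Bayes accuracy is Φ(‖mu‖) = Φ(1) ≈ 0.841, and the empirical estimate above should land close to that value; a trained classifier's excess risk is its gap to this optimum.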
Status
- Workflow Status: Published
- Created By: Tatianna Richardson
- Created: 03/31/2025
- Modified By: Tatianna Richardson
- Modified: 03/31/2025