
PhD Defense by Tian-Yi Zhou


Title: Statistical Learning Theory of Deep Neural Networks: A Generalization Viewpoint


Date: Friday, April 4th, 2025

Time: 1:00 PM ET

Location: Groseclose 303 (or on Zoom: https://gatech.zoom.us/j/7710398345)


Tian-Yi Zhou

Machine Learning PhD Student

H. Milton Stewart School of Industrial and Systems Engineering (ISyE)
Georgia Institute of Technology


Committee

1 Dr. Xiaoming Huo (Advisor, ISyE)

2 Dr. Vladimir Koltchinskii (School of Math)

3 Dr. Wenjing Liao (School of Math)

4 Dr. Guang Cheng (UCLA, Department of Statistics & Data Science)

5 Dr. Sung Ha Kang (School of Math)


Abstract

This thesis investigates the mathematical foundations of deep learning, focusing on the statistical guarantees of deep neural networks in regression, classification, and anomaly detection. Specifically, it seeks to understand when and how neural networks effectively generalize to unseen data in these tasks. By uncovering the statistical and computational mechanisms that drive the success of deep learning, this thesis aims to further improve the robustness and accuracy of systems that rely on these technologies.

Chapter 2 of the thesis studies the phenomenon of benign overfitting in convolutional neural networks (CNNs), an important class of neural networks designed to efficiently learn spatial hierarchies of features. It demonstrates that the generalization rate of a CNN architecture remains unchanged even as the model size and number of parameters grow substantially. Chapter 3 studies the classification of unbounded data generated from Gaussian Mixture Models using fully-connected neural networks. We obtain, for the first time, non-asymptotic upper bounds and convergence rates for the excess risk without restrictions on model parameters. Chapter 4 develops a mathematical framework and theory-grounded tools for unsupervised anomaly detection, with a focus on its practice in cybersecurity. It establishes the first optimality result for anomaly detection and quantifies the amount of synthetic anomalies needed to achieve high accuracy. Finally, Chapter 5 explores the use of deep learning in functional data analysis, focusing on the approximation of nonlinear functionals mapping from a reproducing kernel Hilbert space to the real line ℝ.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 03/31/2025
  • Modified By: Tatianna Richardson
  • Modified: 03/31/2025
