
PhD Defense by Tian-Yi Zhou


Title: Statistical Learning Theory of Deep Neural Networks: A Generalization Viewpoint


Date: Friday, April 4th, 2025

Time: 1:00 PM ET

Location: Groseclose 303 (or on Zoom: https://gatech.zoom.us/j/7710398345)


Tian-Yi Zhou

Machine Learning PhD Student

H. Milton Stewart School of Industrial and Systems Engineering (ISyE)
Georgia Institute of Technology


Committee

1 Dr. Xiaoming Huo (Advisor, ISyE)

2 Dr. Vladimir Koltchinskii (School of Math)

3 Dr. Wenjing Liao (School of Math)

4 Dr. Guang Cheng (UCLA, Department of Statistics & Data Science)

5 Dr. Sung Ha Kang (School of Math)


Abstract

This thesis investigates the mathematical foundations of deep learning, focusing on the statistical guarantees of deep neural networks in regression, classification, and anomaly detection. Specifically, it seeks to understand when and how neural networks effectively generalize to unseen data in these tasks. By uncovering the statistical and computational mechanisms that drive the success of deep learning, this thesis aims to further improve the robustness and accuracy of systems that rely on these technologies.

Chapter 2 of the thesis studies the phenomenon of benign overfitting in convolutional neural networks (CNNs), an important class of neural networks designed to efficiently learn spatial hierarchies of features. It demonstrates that the generalization rate of a CNN architecture remains unchanged even as the model size and number of parameters grow substantially. Chapter 3 studies the classification of unbounded data generated from Gaussian Mixture Models using fully-connected neural networks. We obtain, for the first time, non-asymptotic upper bounds and convergence rates for the excess risk without restrictions on model parameters. Chapter 4 develops a mathematical framework and theory-grounded tools for unsupervised anomaly detection, with a focus on its practice in cybersecurity. It establishes the first optimality result for anomaly detection and quantifies the amount of synthetic anomalies needed to achieve high accuracy. Finally, Chapter 5 explores the use of deep learning in functional data analysis, focusing on the approximation of nonlinear functionals mapping from a reproducing kernel Hilbert space to the real line ℝ.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 03/31/2025
  • Modified By: Tatianna Richardson
  • Modified: 03/31/2025
