TRIAD Lecture Series by Yuxin Chen from Princeton (2/5)


This talk is one of a series given by Professor Chen. The full schedule is as follows:
Wednesday, August 28, 2019; 11:00 am - 12:00 pm; Groseclose 402
Thursday, August 29, 2019; 11:00 am - 12:00 pm; Groseclose 402
Tuesday, September 3, 2019; 11:00 am - 12:00 pm; Main - Executive Education Room 228
Wednesday, September 4, 2019; 11:00 am - 12:00 pm; Main - Executive Education Room 228
Thursday, September 5, 2019; 11:00 am - 12:00 pm; Groseclose 402

Check https://triad.gatech.edu/events for more information. 
For location information, please check https://isye.gatech.edu/about/maps-directions/isye-building-complex

Title of this talk: Random initialization and implicit regularization in nonconvex statistical estimation

Abstract: Recent years have seen a flurry of activity in designing provably efficient nonconvex procedures for solving statistical estimation/learning problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g., trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, the prior theory is often either far from optimal or completely lacks theoretical guarantees.

This talk is concerned with a striking phenomenon arising in two nonconvex problems (i.e. phase retrieval and matrix completion): even in the absence of careful initialization, proper saddle escaping, and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of this is achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data. As a byproduct, for noisy matrix completion, we demonstrate that gradient descent achieves near-optimal entrywise error control.

This is joint work with Cong Ma, Kaizheng Wang, Yuejie Chi, and Jianqing Fan.
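
For readers who would like to see the phenomenon described in the abstract numerically, the following is a minimal sketch of vanilla gradient descent with random initialization on a real-valued phase retrieval problem. It is an illustration only and is not taken from the talk: the Gaussian measurement model, the quartic least-squares loss, the problem sizes (n = 50, m = 1000), the step size 0.1, the iteration count, and all function names are assumptions chosen for the example. No spectral initialization, trimming, or projection is used.

```python
import numpy as np


def make_phase_retrieval_data(n, m, rng):
    """Draw a unit-norm signal and Gaussian measurements y_i = (a_i^T x*)^2.

    The Gaussian design and noiseless measurements are assumptions of this sketch.
    """
    x_star = rng.standard_normal(n)
    x_star /= np.linalg.norm(x_star)
    A = rng.standard_normal((m, n))
    y = (A @ x_star) ** 2
    return A, y, x_star


def vanilla_gd(A, y, x0, step=0.1, iters=500):
    """Plain gradient descent on f(x) = (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2.

    No trimming, projection, or explicit regularization is applied.
    """
    m = A.shape[0]
    x = x0.copy()
    for _ in range(iters):
        Ax = A @ x
        # Gradient of the quartic loss: (1/m) * sum_i ((a_i^T x)^2 - y_i) (a_i^T x) a_i
        grad = A.T @ ((Ax ** 2 - y) * Ax) / m
        x = x - step * grad
    return x


def dist_up_to_sign(x, x_star):
    """Euclidean distance modulo the global sign ambiguity of phase retrieval."""
    return min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))


rng = np.random.default_rng(0)
n, m = 50, 1000                            # hypothetical problem sizes
A, y, x_star = make_phase_retrieval_data(n, m, rng)
x0 = rng.standard_normal(n) / np.sqrt(n)   # random initialization, norm roughly 1
x_hat = vanilla_gd(A, y, x0)

print("initial error:", dist_up_to_sign(x0, x_star))
print("final error:  ", dist_up_to_sign(x_hat, x_star))
```

The error is measured up to a global sign because x and -x produce identical measurements. The sketch only shows the iteration itself; it does not reproduce the leave-one-out analysis mentioned in the abstract.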



  • Workflow Status: Published
  • Created By: Xiaoming Huo
  • Created: 08/25/2019
  • Modified By: Xiaoming Huo
  • Modified: 08/29/2019