Zero-Order and Second-Order Optimization Methods with Applications in Machine Learning

Jorge Nocedal

Northwestern University

We begin by proposing an optimization algorithm that employs only function values and is able to solve noisy problems in thousands of variables. We then consider the problem of training deep neural networks and note that although most high-dimensional nonconvex optimization problems cannot be solved to optimality, deep neural networks have a benign geometry that allows stochastic optimization methods to find acceptable solutions. There are, nevertheless, many open questions concerning the optimization process, including trade-offs between parallelism and the predictive ability of solutions. We discuss classical and new optimization methods in the light of these observations.

-------

Bio

Jorge Nocedal is the Walter P. Murphy Professor in the Department of Industrial Engineering and Management Sciences at Northwestern University. His research is in optimization, both deterministic and stochastic, with emphasis on very large-scale problems. His current work is driven by applications in machine learning. He is a SIAM Fellow and was awarded the 2012 George B. Dantzig Prize and the 2017 von Neumann Theory Prize for contributions to the theory and algorithms of nonlinear optimization.