PhD Proposal by Cyrille Combettes
Title: Frank-Wolfe Methods for Optimization and Machine Learning
Date: Thursday, December 10, 2020
Time: 2pm EST
Ph.D. student in Machine Learning
School of Industrial and Systems Engineering
Georgia Institute of Technology
Prof. Sebastian Pokutta (advisor), Institute of Mathematics, Technische Universität Berlin and Department for AI in Society, Science, and Technology, Zuse Institute Berlin
Prof. Swati Gupta, School of Industrial and Systems Engineering, Georgia Institute of Technology
Prof. Guanghui (George) Lan, School of Industrial and Systems Engineering, Georgia Institute of Technology
The Frank-Wolfe algorithm (FW) has become a popular constrained optimization algorithm for it is simple and projection-free, and it has been successfully applied to problems in traffic assignment, matrix completion, computer vision, optimal transport, and many others. Its main drawback however lies in its convergence rate, which can be excessively slow due to naive descent directions. In the first part of the proposal, we address this issue by designing a boosting procedure generating descent directions better aligned with the negative gradients while still preserving the projection-free property. Although our method is reasonably simple and intuitive, it produces very significant results. We derive sublinear to linear convergence rates and demonstrate computational speedups over the state of the art on a wide variety of experiments.
In the second part of the proposal, we consider the large-scale finite-sum setting arising in many tasks of machine learning. We improve the quality of the first-order information accessed by stochastic FW algorithms by blending in adaptive gradients using a sliding technique. We derive sublinear convergence rates and demonstrate that the blend of the two methods is successful via computational experiments on convex and nonconvex objectives. In particular, our method is the only one to improve the performance of the vanilla stochastic FW on the training of neural networks.
Both previous contributions are motivated by the projection-free property of FW. In the last part of the proposal, we leverage the natural sparsity of the iterates of FW to study an application to the approximate Carathéodory problem. We show that FW generates a simple solution to the problem and that with no modification of the algorithm, better cardinality bounds can be established using existing convergence analyses of FW in different scenarios.
- Workflow Status: Published
- Created By: Tatianna Richardson
- Created: 12/10/2020
- Modified By: Tatianna Richardson
- Modified: 12/10/2020