Practical Frank-Wolfe algorithms

In the last decade there has been a resurgence of interest in Frank-Wolfe (FW) style methods for optimizing a smooth convex function over a polytope. Examples of recently developed techniques include {\em Decomposition-invariant Conditional Gradient} (DiCG), {\em Blended Conditional Gradient} (BCG), and {\em Frank-Wolfe with in-face directions} (IF-FW) methods. We introduce two extensions of these techniques. First, we augment DiCG with the {\em working set} strategy, and show how to optimize over the working set using {\em shadow simplex steps}. Second, we generalize in-face Frank-Wolfe directions to polytopes in which faces cannot be efficiently computed, and also describe a generic recursive procedure that can be used in conjunction with several FW-style techniques. Experimental results indicate that these extensions can speed up the original algorithms by orders of magnitude for certain applications.
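For orientation, the basic scheme that all of the methods named above build on can be sketched as follows. This is a minimal illustration of the classical Frank-Wolfe iteration only, not of DiCG, BCG, IF-FW, or the extensions introduced here. The probability simplex as the polytope, the quadratic objective, the open-loop step size 2/(t+2), and every name in the code (`frank_wolfe`, `lmo_simplex`, `grad_f`, `b`) are assumptions chosen for the example.

```python
from typing import Callable, List

Vector = List[float]


def lmo_simplex(grad: Vector) -> Vector:
    """Linear minimization oracle for the probability simplex.

    Minimizing <grad, s> over the simplex is attained at the vertex e_i
    whose gradient coordinate is smallest.
    """
    i = min(range(len(grad)), key=lambda k: grad[k])
    return [1.0 if k == i else 0.0 for k in range(len(grad))]


def frank_wolfe(
    grad_f: Callable[[Vector], Vector],
    lmo: Callable[[Vector], Vector],
    x0: Vector,
    max_iters: int = 2000,
    tol: float = 1e-8,
) -> Vector:
    """Classical Frank-Wolfe with the open-loop step size 2 / (t + 2)."""
    x = list(x0)
    for t in range(max_iters):
        g = grad_f(x)
        s = lmo(g)  # vertex minimizing the linearized objective
        # Frank-Wolfe gap <g, x - s>; for convex f it upper-bounds f(x) - f*.
        gap = sum(gi * (xi - si) for gi, xi, si in zip(g, x, s))
        if gap <= tol:
            break
        gamma = 2.0 / (t + 2.0)
        # Convex combination of x and s, so the iterate stays feasible.
        x = [xi + gamma * (si - xi) for xi, si in zip(x, s)]
    return x


# Hypothetical test problem: minimize 0.5 * ||x - b||^2 over the simplex.
# Since b lies in the simplex, the minimizer is b itself.
b = [0.2, 0.5, 0.3]
x_star = frank_wolfe(
    grad_f=lambda x: [xi - bi for xi, bi in zip(x, b)],
    lmo=lmo_simplex,
    x0=[1.0, 0.0, 0.0],
)
```

Each iteration needs only one call to the linear minimization oracle and no projection, which is what makes this family of methods attractive over polytopes. The sublinear rate of this plain variant is one motivation for refinements such as away steps and in-face directions.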
