Complexity Issues and Randomization Strategies in Frank-Wolfe Algorithms for Machine Learning

Frank-Wolfe algorithms for convex minimization have recently gained considerable attention from the Optimization and Machine Learning communities, as their properties make them a suitable choice in a variety of applications. However, as each iteration requires to optimize a linear model, a clever implementation is crucial to make such algorithms viable on large-scale datasets. For this purpose, approximation strategies based on a random sampling have been proposed by several researchers. In this work, we perform an experimental study on the effectiveness of these techniques, analyze possible alternatives and provide some guidelines based on our results.

[1]  Bernhard Schölkopf,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2005, IEEE Transactions on Neural Networks.

[2]  Martin Jaggi,et al.  Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization , 2013, ICML.

[3]  Zaïd Harchaoui,et al.  Conditional gradient algorithms for norm-regularized smooth convex optimization , 2013, Math. Program..

[4]  Johan A. K. Suykens,et al.  Hybrid algorithms with applications to sparse and low rank regularization , 2014 .

[5]  Claudio Sartori,et al.  A New Algorithm for Training SVMs Using Approximate Minimal Enclosing Balls , 2010, CIARP.

[6]  Kenneth L. Clarkson,et al.  Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm , 2008, SODA '08.

[7]  Ivor W. Tsang,et al.  Core Vector Machines: Fast SVM Training on Very Large Data Sets , 2005, J. Mach. Learn. Res..

[8]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[9]  Claudio Sartori,et al.  Training Support Vector Machines using Frank-Wolfe Optimization Methods , 2012, Int. J. Pattern Recognit. Artif. Intell..

[10]  S. Sathiya Keerthi,et al.  A fast iterative nearest point algorithm for support vector machine classifier design , 2000, IEEE Trans. Neural Networks Learn. Syst..

[11]  Claudio Sartori,et al.  A novel Frank-Wolfe algorithm. Analysis and applications to large-scale SVM training , 2013, Inf. Sci..

[12]  Philip Wolfe,et al.  An algorithm for quadratic programming , 1956 .

[13]  Piyush Kumar,et al.  A Linearly Convergent Linear-Time First-Order Algorithm for Support Vector Classification with a Core Set Result , 2011, INFORMS J. Comput..

[14]  Mark W. Schmidt,et al.  Block-Coordinate Frank-Wolfe Optimization for Structural SVMs , 2012, ICML.

[15]  Martin Jaggi,et al.  An Affine Invariant Linear Convergence Analysis for Frank-Wolfe Algorithms , 2013, 1312.7864.

[16]  Giampaolo Liuzzi,et al.  Solving $$\ell _0$$ℓ0-penalized problems with simple constraints via the Frank–Wolfe reduced dimension method , 2015, Optim. Lett..

[17]  Patrice Marcotte,et al.  Some comments on Wolfe's ‘away step’ , 1986, Math. Program..

[18]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.