Paper Information - Convergence Analysis of a Stochastic Projection-free Algorithm

This paper presents and analyzes a stochastic version of the Frank-Wolfe algorithm (a.k.a. conditional gradient method or projection-free algorithm) for constrained convex optimization. First, we prove that when the error of the gradient estimate decays as ${\cal O}( \sqrt{ \eta_t^{\Delta} / t } )$, where $t$ is the iteration index and $\eta_t^{\Delta}$ is an increasing sequence, the objective value of the stochastic Frank-Wolfe algorithm converges at a rate of at least the same order. When the optimal solution lies in the interior of the constraint set, the convergence rate is accelerated to ${\cal O}(\eta_t^{\Delta} /t)$. Second, we study how the stochastic Frank-Wolfe algorithm can be applied to a few practical machine learning problems, and we establish tight bounds on the gradient estimate errors for these examples. Numerical simulations support our findings.
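To make the setting above concrete, here is a minimal sketch of a stochastic Frank-Wolfe iteration. It is not the paper's algorithm or experiments: the $\ell_1$-ball constraint, the toy quadratic objective, the step size $2/(t+1)$, and the batch size that grows with $t$ are all assumptions chosen for illustration. Averaging $t$ noisy gradients is one simple way to make the gradient error shrink roughly like $1/\sqrt{t}$. The names `lmo_l1`, `stochastic_frank_wolfe`, `noisy_quadratic_grad`, and `TARGET` are hypothetical.

```python
import random

# Hypothetical toy problem: minimize E[0.5 * ||x - (TARGET + noise)||^2]
# over the l1 ball of radius 1. TARGET lies in the interior of the ball.
TARGET = [0.5, -0.3, 0.0]


def noisy_quadratic_grad(x, rng, sigma=0.1):
    """One noisy gradient sample of the toy objective at x."""
    return [xj - tj - rng.gauss(0.0, sigma) for xj, tj in zip(x, TARGET)]


def lmo_l1(grad, radius):
    """Linear minimization oracle: argmin of <grad, s> over ||s||_1 <= radius."""
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        # Put all mass on the largest-magnitude coordinate, opposite in sign.
        s[i] = -radius if grad[i] > 0 else radius
    return s


def stochastic_frank_wolfe(sample_grad, dim, radius, num_iters, seed=0):
    """Frank-Wolfe with a minibatch gradient estimate (illustrative sketch)."""
    rng = random.Random(seed)
    x = [0.0] * dim  # feasible starting point
    for t in range(1, num_iters + 1):
        # Assumption: batch size t, so the estimate error decays with t.
        g = [0.0] * dim
        for _ in range(t):
            sample = sample_grad(x, rng)
            for j in range(dim):
                g[j] += sample[j] / t
        s = lmo_l1(g, radius)
        gamma = 2.0 / (t + 1)  # assumed step-size schedule
        # Convex combination of feasible points, so no projection is needed.
        x = [(1.0 - gamma) * xj + gamma * sj for xj, sj in zip(x, s)]
    return x


if __name__ == "__main__":
    x_hat = stochastic_frank_wolfe(noisy_quadratic_grad, dim=3,
                                   radius=1.0, num_iters=300)
    gap = 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, TARGET))
    print("iterate:", x_hat)
    print("objective gap:", gap)
```

Each update is a convex combination of the current iterate and a vertex returned by the linear minimization oracle, which is why the iterates stay inside the constraint set without any projection step.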
