Paper Information - On Matching Pursuit and Coordinate Descent

On Matching Pursuit and Coordinate Descent

Two popular examples of first-order optimization methods over linear spaces are coordinate descent and matching pursuit algorithms, with their randomized variants. While the former targets the optimization by moving along coordinates, the latter considers a generalized notion of directions. Exploiting the connection between the two algorithms, we present a unified analysis of both, providing affine invariant sublinear $\mathcal{O}(1/t)$ rates on smooth objectives and linear convergence on strongly convex objectives. As a byproduct of our affine invariant analysis of matching pursuit, our rates for steepest coordinate descent are the tightest known. Furthermore, we show the first accelerated convergence rate $\mathcal{O}(1/t^2)$ for matching pursuit and steepest coordinate descent on convex objectives.
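The relationship between the two methods can be illustrated with a minimal sketch. It is not the paper's algorithm or analysis. It assumes a least-squares objective, a finite dictionary of atoms stored as matrix columns, and a step size based on the Euclidean smoothness constant `L` rather than the affine invariant quantities mentioned in the abstract. The names `matching_pursuit`, `atoms`, `A`, `b`, and `D` are hypothetical. With the standard basis as the dictionary, the same update reduces to steepest coordinate descent.

```python
import numpy as np

def matching_pursuit(grad, atoms, x0, L, steps):
    """Sketch of generalized matching pursuit (assumed form, not the paper's exact method).

    At each step, pick the atom (column of `atoms`) with the largest absolute
    inner product with the gradient, then move along it with the step that
    minimizes the L-smooth quadratic upper bound in that direction.
    """
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        scores = atoms.T @ g                 # <g, z> for every atom z
        i = int(np.argmax(np.abs(scores)))   # atom best aligned with the gradient
        z = atoms[:, i]
        x -= scores[i] / (L * (z @ z)) * z   # step along the chosen atom
    return x

# Hypothetical test problem: f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.eigvalsh(A.T @ A).max()        # Euclidean smoothness constant of f

x0 = np.zeros(5)

# Atoms = coordinate directions: the update is steepest coordinate descent.
x_cd = matching_pursuit(grad, np.eye(5), x0, L, 200)

# Atoms = a general dictionary of unit-norm directions: matching pursuit.
D = rng.standard_normal((5, 12))
D /= np.linalg.norm(D, axis=0)
x_mp = matching_pursuit(grad, D, x0, L, 200)

print(f(x0), f(x_cd), f(x_mp))
```

Each step decreases the objective by at least $\langle \nabla f(x), z\rangle^2 / (2L\|z\|^2)$ under the smoothness assumption, so both runs should end at a lower objective value than the starting point.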
