Linear Convergence of Randomized Feasible Descent Methods Under the Weak Strong Convexity Assumption

In this paper we generalize the feasible descent method (FDM) framework to a randomized feasible descent method (R-FDM) and a coordinate-wise randomized feasible descent method (RC-FDM). We show that the well-known SDCA algorithm for optimizing the SVM dual problem, as well as the stochastic coordinate descent method for the LASSO problem, fits into the RC-FDM framework. We prove linear convergence for both R-FDM and RC-FDM under the weak strong convexity assumption, a relaxation of strong convexity that permits a non-unique solution set. Moreover, we show that the duality gap converges linearly for RC-FDM, which implies that the duality gap also converges linearly for SDCA applied to the SVM dual problem.
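For readers unfamiliar with the condition, the LaTeX snippet below states one common formulation of weak strong convexity from this line of work. It is a sketch of the standard definition as an assumption on our part; the paper's precise statement may differ in details such as the choice of norm.

    % Weak strong convexity (one common formulation; assumption, not
    % reproduced from the paper). X is the feasible set, X* the solution
    % set, f* the optimal value, and \bar{x} the projection of x onto X*.
    \[
        f^\star \;\geq\; f(x) + \langle \nabla f(x),\, \bar{x} - x \rangle
        + \frac{\mu}{2}\, \|\bar{x} - x\|^2
        \qquad \text{for all } x \in X,
        \quad \text{where } \bar{x} = \arg\min_{y \in X^\star} \|y - x\|.
    \]

Unlike strong convexity, this condition only requires quadratic growth of f around the solution set, so X* may contain more than one point.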

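As a concrete illustration of an RC-FDM-style method, the sketch below implements randomized coordinate descent for the LASSO problem, one of the two examples the abstract places in the framework. This is a minimal illustration under standard assumptions, not the paper's algorithm; all names (soft_threshold, rcd_lasso, etc.) are ours.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t * |.| (soft-thresholding).
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def rcd_lasso(A, b, lam, iters=20000, seed=0):
        # Randomized coordinate descent for the LASSO problem
        #   min_x  0.5 * ||A x - b||^2 + lam * ||x||_1.
        # Each step picks a coordinate uniformly at random and minimizes
        # the objective exactly over that coordinate (closed form).
        rng = np.random.default_rng(seed)
        n = A.shape[1]
        x = np.zeros(n)
        r = b - A @ x                      # residual, maintained incrementally
        col_sq = (A ** 2).sum(axis=0)      # ||A_j||^2 for each coordinate j
        for _ in range(iters):
            j = rng.integers(n)
            if col_sq[j] == 0.0:
                continue                   # zero column: nothing to update
            rho = A[:, j] @ r + col_sq[j] * x[j]
            x_new = soft_threshold(rho, lam) / col_sq[j]
            r += A[:, j] * (x[j] - x_new)  # keep r = b - A x consistent
            x[j] = x_new
        return x

    # Tiny usage example on synthetic data.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((100, 20))
    x_true = np.zeros(20); x_true[:3] = [2.0, -1.0, 0.5]
    b = A @ x_true + 0.01 * rng.standard_normal(100)
    x_hat = rcd_lasso(A, b, lam=0.1)

Maintaining the residual r incrementally keeps each coordinate update at O(m) cost, which is what makes coordinate-wise methods of this kind attractive for large-scale problems such as the SVM dual and LASSO.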