A linearly convergent doubly stochastic Gauss–Seidel algorithm for solving linear equations and a certain class of over-parameterized optimization problems

Consider the classical problem of solving a general linear system of equations $Ax = b$. It is well known that the (successively over-relaxed) Gauss–Seidel scheme and many of its variants may not converge when $A$ is neither diagonally dominant nor symmetric positive definite. Can we have a linearly convergent G–S type algorithm that works for any $A$? In this paper we answer this question affirmatively by proposing a doubly stochastic G–S algorithm that is provably linearly convergent (in the mean square error sense) for any feasible linear system of equations. The key to the algorithm design is to introduce a nonuniform doubly stochastic scheme for picking the equation and the variable in each update step, together with a stepsize rule. These techniques also generalize to certain iterative alternating projection algorithms for solving the linear feasibility problem $Ax \le b$ with an arbitrary $A$, as well as to high-dimensional minimization problems for training over-parameterized models in machine learning. Our results demonstrate that a carefully designed randomization scheme can make an otherwise divergent G–S algorithm converge.
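
To make the idea concrete, below is a minimal Python sketch of a doubly stochastic, coordinate-wise iteration of this flavor. It assumes the equation–variable pair $(i, j)$ is sampled with probability proportional to $|a_{ij}|^2$ and uses a fixed stepsize `gamma`; the paper's actual sampling distribution and stepsize rule may differ, so this is an illustration of the randomization idea rather than the authors' exact method.

```python
import numpy as np

def doubly_stochastic_gs(A, b, gamma=0.5, iters=50000, seed=0):
    """Sketch of a doubly stochastic Gauss-Seidel / Kaczmarz-style iteration.

    At each step an equation i and a variable j are sampled jointly with
    probability proportional to |A[i, j]|^2, and only coordinate x_j is
    updated using the residual of equation i, scaled by the stepsize gamma.
    This illustrates randomizing over both rows and columns; it is not
    necessarily the exact update rule analyzed in the paper.
    """
    m, n = A.shape
    x = np.zeros(n)
    # Nonuniform joint sampling distribution over (equation, variable) pairs.
    probs = (A ** 2).ravel() / np.sum(A ** 2)
    rng = np.random.default_rng(seed)
    samples = rng.choice(m * n, size=iters, p=probs)
    for k in samples:
        i, j = divmod(k, n)
        residual_i = A[i] @ x - b[i]
        # Coordinate update of x_j driven by the residual of equation i.
        x[j] -= gamma * A[i, j] * residual_i / np.sum(A[i] ** 2)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((30, 30))  # neither diagonally dominant nor PSD
    x_true = rng.standard_normal(30)
    b = A @ x_true                     # consistent system
    x_hat = doubly_stochastic_gs(A, b)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

For a consistent system, this iteration is a stochastic coordinate-descent pass over $\tfrac{1}{2}\lVert Ax - b\rVert^2$ with importance sampling, which is why it can tolerate matrices that break the classical G–S convergence conditions, at the cost of more iterations than a deterministic sweep.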
