Stochastic Matrix-Free Equilibration

We present a novel method for approximately equilibrating a matrix using only multiplication by the matrix and its transpose. The method is based on convex optimization and projected stochastic gradient descent, using an unbiased gradient estimate obtained by a randomized method. It provably converges in expectation and, empirically, gives good results within a small number of iterations. We show how the method can be applied as a preconditioner for matrix-free iterative algorithms, substantially reducing the number of iterations required to reach a given level of precision. We also derive a new connection between equilibration and condition number, showing that equilibration minimizes an upper bound on the condition number over all choices of row and column scalings.
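The abstract does not spell out the randomized gradient estimate, so the following is only a minimal sketch of one standard building block that fits the description: estimating the squared row and column norms of a matrix from matrix-vector products alone. For a vector s with independent random ±1 entries, the elementwise square of As has expectation equal to the vector of squared row norms of A, and the same holds for the transpose and column norms. The helper names (`matvec`, `rmatvec`, `sq_norm_sample`, `sq_norm_mean_over_all_signs`) and the small test matrix are hypothetical, chosen for illustration; they are not taken from the paper.

```python
import itertools
import random


def matvec(A, x):
    """Return A @ x for a dense matrix A stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A.T @ y without forming the transpose explicitly."""
    out = [0.0] * len(A[0])
    for row, y_i in zip(A, y):
        for j, a_ij in enumerate(row):
            out[j] += a_ij * y_i
    return out


def sq_norm_sample(apply_op, dim, rng):
    """Draw one unbiased sample of the squared row norms of an operator.

    apply_op: function mapping a length-`dim` vector to the operator's output.
    The operator is touched only through this single multiplication.
    """
    s = [rng.choice((-1.0, 1.0)) for _ in range(dim)]  # random sign vector
    return [t * t for t in apply_op(s)]


def sq_norm_mean_over_all_signs(apply_op, dim):
    """Average the estimator over every sign vector (feasible only for tiny dim).

    This equals the estimator's exact expectation, so it should match the
    true squared row norms up to rounding.
    """
    outputs = [apply_op(list(s))
               for s in itertools.product((-1.0, 1.0), repeat=dim)]
    return [sum(out[i] ** 2 for out in outputs) / len(outputs)
            for i in range(len(outputs[0]))]


# Hypothetical 2 x 3 example matrix.
A = [[1.0, 2.0, 3.0],
     [0.0, -4.0, 5.0]]
rng = random.Random(0)

# Single noisy samples, as a stochastic gradient step would use them.
row_sample = sq_norm_sample(lambda x: matvec(A, x), 3, rng)
col_sample = sq_norm_sample(lambda y: rmatvec(A, y), 2, rng)

# Exact expectations: squared row norms and squared column norms of A.
row_mean = sq_norm_mean_over_all_signs(lambda x: matvec(A, x), 3)
col_mean = sq_norm_mean_over_all_signs(lambda y: rmatvec(A, y), 2)
```

In a matrix-free equilibration loop, `apply_op` would presumably multiply by the currently scaled matrix (row scaling, then the matrix, then column scaling) rather than by A itself. That wiring is an assumption here, not something the abstract states.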

[38]  Danny C. Sorensen,et al.  Deflation Techniques for an Implicitly Restarted Arnoldi Iteration , 1996, SIAM J. Matrix Anal. Appl..