Much Faster Algorithms for Matrix Scaling

We develop several efficient algorithms for the classical Matrix Scaling problem, which is used in many diverse areas, from preconditioning linear systems to approximation of the permanent. On an input n × n matrix A, this problem asks to find diagonal (scaling) matrices X and Y (if they exist), so that X A Y ε-approximates a doubly stochastic matrix, or more generally a matrix with prescribed row and column sums. We address the general scaling problem as well as some important special cases. In particular, if A has m nonzero entries, and if there exist X and Y with polynomially large entries such that X A Y is doubly stochastic, then we can solve the problem in total complexity \tilde{O}(m + n^{4/3}). This greatly improves on the best known previous results, which were either \tilde{O}(n^4) or O(m n^{1/2}/ε). Our algorithms are based on tailor-made first- and second-order techniques, combined with other recent advances in continuous optimization, which may be of independent interest for solving similar problems.
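To make the problem statement concrete, the following is a minimal sketch of the classical Sinkhorn (RAS) iteration, which alternately normalizes rows and columns. It is not one of the faster algorithms announced in the abstract; it only illustrates what finding diagonal X and Y such that X A Y is ε-close to doubly stochastic means. The function names `sinkhorn_scale` and `apply_scaling`, the tolerance `eps`, the iteration cap `max_iters`, and the choice of measuring error as the largest deviation of a row sum from 1 are all assumptions made for this illustration.

```python
def apply_scaling(A, x, y):
    """Return the matrix diag(x) * A * diag(y) as a list of lists."""
    n = len(A)
    return [[x[i] * A[i][j] * y[j] for j in range(n)] for i in range(n)]


def sinkhorn_scale(A, eps=1e-9, max_iters=10_000):
    """Classical Sinkhorn (RAS) iteration on a square nonnegative matrix A.

    Returns diagonal entries x, y such that every row and column sum of
    diag(x) * A * diag(y) is within eps of 1. Illustrative baseline only.
    """
    n = len(A)
    x = [1.0] * n
    y = [1.0] * n
    for _ in range(max_iters):
        # Row step: choose x so that every row of diag(x) A diag(y) sums to 1.
        for i in range(n):
            s = sum(A[i][j] * y[j] for j in range(n))
            if s == 0.0:
                raise ValueError("zero row: matrix cannot be scaled")
            x[i] = 1.0 / s
        # Column step: choose y so that every column sums to 1.
        for j in range(n):
            s = sum(x[i] * A[i][j] for i in range(n))
            if s == 0.0:
                raise ValueError("zero column: matrix cannot be scaled")
            y[j] = 1.0 / s
        # Column sums are now 1 up to rounding, so only row sums need checking.
        B = apply_scaling(A, x, y)
        err = max(abs(sum(row) - 1.0) for row in B)
        if err <= eps:
            return x, y
    raise RuntimeError("did not reach tolerance eps within max_iters")


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    x, y = sinkhorn_scale(A)
    for row in apply_scaling(A, x, y):
        print(row)
```

Each pass over the matrix costs time proportional to the number of entries, and the number of passes depends on ε and on A. That dependence is the kind of cost the complexity bounds in the abstract are compared against.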
