Stronger and Faster Approximate Singular Value Decomposition via the Block Lanczos Method

Since being analyzed by Rokhlin, Szlam, and Tygert [1] and popularized by Halko, Martinsson, and Tropp [2], randomized Simultaneous Power Iteration has become the method of choice for approximate singular value decomposition. It is more accurate than simpler sketching algorithms, yet still converges quickly for any matrix, independently of singular value gaps. After Õ(1/ε) iterations, it gives a low-rank approximation within (1 + ε) of optimal for spectral norm error. We give the first provable runtime improvement on Simultaneous Iteration: a simple randomized variant of the classic Block Lanczos method gives the same guarantees in just Õ(1/√ε) iterations and performs substantially better experimentally. Despite the long history of Krylov subspace methods, our analysis is the first of a method like Block Lanczos that does not depend on singular value gaps, which are unreliable in practice. Furthermore, while it is a simple accuracy benchmark, even (1 + ε) error for spectral norm low-rank approximation does not imply that an algorithm returns high-quality principal components, a major issue for data applications. We address this problem for the first time by showing that both Block Lanczos and a minor modification of Simultaneous Iteration give nearly optimal PCA for any matrix. This result further justifies their strength over non-iterative sketching methods. Finally, we give insight beyond the worst case, justifying why both algorithms can run much faster in practice than predicted. We clarify how simple techniques can take advantage of common matrix properties to significantly improve runtime.
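To make the comparison concrete, the two approaches can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions, not the authors' implementation: the abstract does not spell out either algorithm, so the sketch follows the commonly described formulations (a Gaussian starting block, q applications of AAᵀ, and a Rayleigh-Ritz step to extract k vectors). The function names, the per-step QR re-orthonormalization, and the parameters `k` and `q` are illustrative choices. Simultaneous Iteration keeps only the last block (AAᵀ)^q AΠ, while the block Krylov variant keeps every intermediate block, which is what allows it to use fewer iterations.

```python
import numpy as np


def _rayleigh_ritz(A, Q, k):
    """Pick k orthonormal vectors inside span(Q) from the top-k left
    singular vectors of Q^T A (assumes Q has orthonormal columns)."""
    U, _, _ = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U[:, :k]


def simultaneous_iteration(A, k, q, rng):
    """Randomized simultaneous (subspace) iteration, keeping one n-by-k block."""
    Pi = rng.standard_normal((A.shape[1], k))   # random Gaussian start
    Q, _ = np.linalg.qr(A @ Pi)
    for _ in range(q):
        # Apply A A^T and re-orthonormalize to keep the block well conditioned.
        Q, _ = np.linalg.qr(A @ (A.T @ Q))
    return _rayleigh_ritz(A, Q, k)


def block_krylov(A, k, q, rng):
    """Randomized block Krylov (Block Lanczos style) iteration.
    Keeps all q + 1 blocks, so it assumes (q + 1) * k <= A.shape[0]."""
    Pi = rng.standard_normal((A.shape[1], k))
    block, _ = np.linalg.qr(A @ Pi)
    blocks = [block]
    for _ in range(q):
        # Orthonormalizing a block does not change its span, so the stacked
        # blocks still span [A Pi, (A A^T) A Pi, ..., (A A^T)^q A Pi].
        block, _ = np.linalg.qr(A @ (A.T @ blocks[-1]))
        blocks.append(block)
    Q, _ = np.linalg.qr(np.hstack(blocks))      # basis for the Krylov subspace
    return _rayleigh_ritz(A, Q, k)


def spectral_error(A, Z):
    """Spectral norm error of projecting A onto span(Z)."""
    return np.linalg.norm(A - Z @ (Z.T @ A), 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 100))
    sigma = np.linalg.svd(A, compute_uv=False)
    k, q = 5, 4
    for method in (simultaneous_iteration, block_krylov):
        Z = method(A, k, q, rng)
        # Ratio to the optimal rank-k spectral error sigma_{k+1}.
        print(method.__name__, spectral_error(A, Z) / sigma[k])
```

The printed ratio corresponds to the (1 + ε) factor discussed above: a value of 1 would mean the rank-k approximation matches the optimal spectral norm error. The block Krylov basis is (q + 1) times wider than the Simultaneous Iteration block, so each iteration costs more memory and a larger final factorization in exchange for needing fewer passes over A.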

[1] C. Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, 1950.

[2] F. L. Bauer. Das Verfahren der Treppeniteration und verwandte Verfahren zur Lösung algebraischer Eigenwertprobleme, 1957.

[3] L. Mirsky. Symmetric gauge functions and unitarily invariant norms, 1960.

[4] H. Rutishauser. Simultaneous iteration method for symmetric matrices, 1970.

[5] J. Cullum, et al. A block Lanczos algorithm for computing the q algebraically largest eigenvalues and a corresponding eigenspace of large, sparse, real symmetric matrices, 1974, CDC 1974.

[6] Y. Saad. On the Rates of Convergence of the Lanczos and the Block-Lanczos Methods, 1980.

[7] Franklin T. Luk, et al. A Block Lanczos Method for Computing the Singular Values and Corresponding Singular Vectors of a Matrix, 1981, TOMS.

[8] Gene H. Golub, et al. Matrix computations, 1983.

[9] Ming Gu, et al. Efficient Algorithms for Computing a Strong Rank-Revealing QR Factorization, 1996, SIAM J. Sci. Comput.

[10] Santosh S. Vempala, et al. Latent semantic indexing: a probabilistic analysis, 1998, PODS '98.

[11] J. Mason, et al. Integration Using Chebyshev Polynomials, 2003.

[12] Alan M. Frieze, et al. Fast monte-carlo algorithms for finding low-rank approximations, 2004, JACM.

[13] Alan M. Frieze, et al. Clustering Large Graphs via the Singular Value Decomposition, 2004, Machine Learning.

[14] Christos Faloutsos, et al. Graphs over time: densification laws, shrinking diameters and possible explanations, 2005, KDD '05.

[15] V. Rokhlin, et al. A randomized algorithm for the approximation of matrices, 2006.

[16] Tamás Sarlós, et al. Improved Approximation Algorithms for Large Matrices via Random Projections, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[17] Petros Drineas, et al. Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix, 2006, SIAM J. Comput.

[18] Santosh S. Vempala, et al. Adaptive Sampling and Fast Low-Rank Matrix Approximation, 2006, APPROX-RANDOM.

[19] Gene H. Golub, et al. The block Lanczos method for computing eigenvalues, 2007, Milestones in Matrix Computation.

[20] Jure Leskovec, et al. The dynamics of viral marketing, 2005, EC '06.

[21] Mark Tygert, et al. A Randomized Algorithm for Principal Component Analysis, 2008, SIAM J. Matrix Anal. Appl.

[22] M. Rudelson, et al. Non-asymptotic theory of random matrices: extreme singular values, 2010, 1003.2990.

[23] A. Rantzer, et al. On the Minimum Rank of a Generalized Matrix Approximation Problem in the Maximum Singular Value Norm, 2010.

[24] Nathan Halko, et al. An Algorithm for the Principal Component Analysis of Large Data Sets, 2010, SIAM J. Sci. Comput.

[25] Philipp Birken, et al. Numerical Linear Algebra, 2011, Encyclopedia of Parallel Computing.

[26] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[27] Christos Boutsidis, et al. Near Optimal Column-Based Matrix Reconstruction, 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.

[28] Nathan Halko, et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2009, SIAM Rev.

[29] Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.

[30] David P. Woodruff, et al. Low rank approximation and regression in input sparsity time, 2013, STOC '13.

[31] Huy L. Nguyen, et al. OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings, 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[32] Michael W. Mahoney, et al. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression, 2012, STOC '13.

[33] Mark Tygert, et al. An implementation of a randomized algorithm for principal component analysis, 2014, ArXiv.

[34] Christos Boutsidis, et al. Near-Optimal Column-Based Matrix Reconstruction, 2014, SIAM J. Comput.

[35] David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra, 2014, Found. Trends Theor. Comput. Sci.

[36] Emmanuel J. Candès, et al. Randomized Algorithms for Low-Rank Matrix Factorizations: Sharp Performance Bounds, 2013, Algorithmica.

[37] Michael B. Cohen, et al. Dimensionality Reduction for k-Means Clustering and Low Rank Approximation, 2014, STOC.

[38] Ming Gu, et al. Subspace Iteration Randomization and Singular Value Problems, 2014, SIAM J. Sci. Comput.

[39] Zohar S. Karnin, et al. Online PCA with Spectral Bounds, 2015.

[40] Aggelos K. Katsaggelos, et al. Methods for large scale machine learning, 2016.