Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets.

This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis.

The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the $k$ dominant components of the singular value decomposition of an $m \times n$ matrix.

(i) For a dense input matrix, randomized algorithms require $O(mn \log(k))$ floating-point operations (flops) in contrast to $O(mnk)$ for classical algorithms.

(ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures.

(iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $O(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
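To make the two-stage structure concrete, the following is a minimal sketch in Python/NumPy of the sample-then-compress idea described above: a Gaussian test matrix samples the range of the input, an orthonormal basis for that subspace is formed, and a small deterministic SVD of the compressed matrix yields the approximate factorization. The function name and the oversampling and power-iteration parameters `p` and `q` are illustrative choices for this sketch, not a definitive implementation of the paper's algorithms.

```python
import numpy as np

def randomized_svd(A, k, p=10, q=0):
    """Sketch of a randomized truncated SVD of A (m x n), rank k."""
    m, n = A.shape
    # Stage A: sample the range of A with a Gaussian test matrix (k + p columns).
    Omega = np.random.standard_normal((n, k + p))
    Y = A @ Omega
    # Optional power iterations sharpen the subspace when singular values decay slowly.
    for _ in range(q):
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)            # orthonormal basis for the sampled range
    # Stage B: compress A to the subspace and factor the small matrix deterministically.
    B = Q.T @ A                       # (k + p) x n
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_hat
    return U[:, :k], s[:k], Vt[:k, :]
```

With a Gaussian test matrix this sketch costs roughly $O(mn(k+p))$ flops on a dense input; the $O(mn \log(k))$ figure quoted in the abstract requires replacing the Gaussian matrix with a structured random matrix, such as a subsampled randomized Fourier transform, so that the product $A\Omega$ can be applied quickly.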
