On randomized sketching algorithms and the Tracy–Widom law

A growing body of work explores the integration of random projection into algorithms for numerical linear algebra. The primary motivation is to reduce the overall computational cost of processing large datasets. A suitably chosen random projection can embed the original dataset in a lower-dimensional space while retaining its key properties. These algorithms are often referred to as sketching algorithms, because the projected dataset can serve as a compressed representation of the full dataset. We show that random matrix theory, in particular the Tracy–Widom law, is useful for describing the operating characteristics of sketching algorithms in the tall-data regime, where n ≫ d. Asymptotic large-sample results are of particular interest because this is the regime in which sketching is most useful for data compression. In particular, we develop asymptotic approximations for the success rate in generating random subspace embeddings and for the convergence probability of iterative sketching algorithms. We test a number of sketching algorithms on real large high-dimensional datasets and find that the asymptotic expressions give accurate predictions of the empirical performance.
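To make the notion of a "success rate in generating random subspace embeddings" concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a Gaussian sketching matrix S of size k × n with independent N(0, 1/k) entries, a full-column-rank data matrix A of size n × d, and the common convention that S is an ε-subspace embedding for A when every eigenvalue of (SU)ᵀ(SU) lies in [1 − ε, 1 + ε], where U is an orthonormal basis for the column space of A. The function names, the choice of ε, and the synthetic data are all hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_sketch(k, n, rng):
    """Draw a k x n Gaussian sketching matrix with N(0, 1/k) entries (assumed scaling)."""
    return rng.standard_normal((k, n)) / np.sqrt(k)


def is_subspace_embedding(S, A, eps):
    """Check whether S is an eps-subspace embedding for the column space of A.

    Assumes A has full column rank. With U an orthonormal basis of col(A),
    the check is that all eigenvalues of (S U)^T (S U) lie in [1 - eps, 1 + eps].
    """
    U, _ = np.linalg.qr(A)  # reduced QR: U is n x d with orthonormal columns
    singular_values = np.linalg.svd(S @ U, compute_uv=False)
    eigenvalues = singular_values ** 2
    return bool(eigenvalues.min() >= 1 - eps and eigenvalues.max() <= 1 + eps)


def empirical_success_rate(A, k, eps, trials, seed=0):
    """Monte Carlo estimate of the probability that a fresh sketch is an embedding."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    successes = 0
    for _ in range(trials):
        S = gaussian_sketch(k, n, rng)
        successes += is_subspace_embedding(S, A, eps)
    return successes / trials


if __name__ == "__main__":
    # Synthetic tall dataset (n much larger than d); illustrative values only.
    data_rng = np.random.default_rng(1)
    A = data_rng.standard_normal((1000, 5))
    for k in (50, 200, 800):
        rate = empirical_success_rate(A, k, eps=0.5, trials=20)
        print(f"k={k}: empirical success rate {rate:.2f}")
```

For a Gaussian S, the product SU itself has independent N(0, 1/k) entries, so (SU)ᵀ(SU) is a scaled white Wishart matrix. Its extreme eigenvalues are the quantities whose fluctuations the Tracy–Widom law describes, which is why an asymptotic approximation to the success rate is plausible in this setting.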
