An Introduction to Matrix Concentration Inequalities

In recent years, random matrices have come to play a major role in computational mathematics, but most of the classical areas of random matrix theory remain the province of experts. Over the last decade, with the advent of matrix concentration inequalities, research has advanced to the point where we can conquer many (formerly) challenging problems with a page or two of arithmetic. The aim of this monograph is to describe the most successful methods from this area along with some interesting examples that these techniques can illuminate.
