On the principal components of sample covariance matrices

We introduce a class of $M \times M$ sample covariance matrices $\mathcal{Q}$ which subsumes and generalizes several previous models. The associated population covariance matrix $\Sigma = \mathbb{E}\mathcal{Q}$ is assumed to differ from the identity by a matrix of bounded rank. All quantities except the rank of $\Sigma - I_M$ may depend on $M$ in an arbitrary fashion. We investigate the principal components, i.e. the top eigenvalues and eigenvectors, of $\mathcal{Q}$. We derive precise large deviation estimates on the generalized components $\langle \mathbf{w}, \boldsymbol{\xi}_i \rangle$ of the outlier and non-outlier eigenvectors $\boldsymbol{\xi}_i$. Our results also hold near the so-called BBP transition, where outliers are created or annihilated, and for degenerate or near-degenerate outliers. We believe the obtained rates of convergence to be optimal. In addition, we derive the asymptotic distribution of the generalized components of the non-outlier eigenvectors. A novel observation arising from our results is that, unlike the eigenvalues, the eigenvectors of the principal components contain information about the subcritical spikes of $\Sigma$. The proofs use several results on the eigenvalues and eigenvectors of the uncorrelated matrix $\mathcal{Q}$, satisfying $\mathbb{E}\mathcal{Q} = I_M$, as input: the isotropic local Marchenko–Pastur law established in Bloemendal et al. (Electron J Probab 19:1–53, 2014), level repulsion, and quantum unique ergodicity of the eigenvectors. The latter is a special case of a new universality result for the joint eigenvalue–eigenvector distribution.
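As a rough numerical illustration of the BBP transition mentioned above, the following sketch simulates the simplest special case only: Gaussian data with a single rank-one spike. It is not the general class of matrices treated in the paper. The normalization (population eigenvalue `ell`, aspect ratio `gamma = M/N`), the function names, and the parameter values are assumptions made for this example; the limiting formulas in `bbp_prediction` are the classical ones for the Gaussian spiked model, not the paper's finite-$M$ estimates.

```python
import numpy as np


def spiked_top_component(M, N, ell, seed=0):
    """Simulate Q = X X^T / N with population covariance diag(ell, 1, ..., 1).

    Returns the largest eigenvalue of Q and the squared component of the
    corresponding eigenvector along the spike direction e_1.
    (Hypothetical helper: Gaussian rank-one spike only.)
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((M, N))
    X[0, :] *= np.sqrt(ell)          # first coordinate has variance ell
    Q = X @ X.T / N                  # M x M sample covariance matrix
    evals, evecs = np.linalg.eigh(Q)  # eigenvalues in ascending order
    top_value = float(evals[-1])
    overlap = float(evecs[0, -1] ** 2)  # squared overlap with e_1
    return top_value, overlap


def bbp_prediction(gamma, ell):
    """Classical large-M limits for the rank-one Gaussian spiked model.

    Supercritical spike (ell > 1 + sqrt(gamma)): an outlier appears at
    ell * (1 + gamma / (ell - 1)) with a nontrivial eigenvector overlap.
    Subcritical spike: the top eigenvalue sticks to the bulk edge
    (1 + sqrt(gamma))^2 and the overlap vanishes in the limit.
    """
    if ell <= 1 + np.sqrt(gamma):
        return float((1 + np.sqrt(gamma)) ** 2), 0.0
    outlier = ell * (1 + gamma / (ell - 1))
    overlap = (1 - gamma / (ell - 1) ** 2) / (1 + gamma / (ell - 1))
    return float(outlier), float(overlap)


if __name__ == "__main__":
    M, N = 500, 2000                 # gamma = 0.25, threshold ell = 1.5
    for ell in (3.0, 1.2):           # one supercritical, one subcritical spike
        observed = spiked_top_component(M, N, ell)
        predicted = bbp_prediction(M / N, ell)
        print(f"ell={ell}: observed={observed}, limit={predicted}")
```

At finite $M$ the simulated values only approximate these limits. In particular, the limiting overlap of zero in the subcritical case is a leading-order statement and says nothing about the finer eigenvector information on subcritical spikes that the abstract describes.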

[1]  V. Marčenko,et al.  Distribution of eigenvalues for some sets of random matrices , 1967 .

[2]  C. Tracy,et al.  Level-spacing distributions and the Airy kernel , 1992, hep-th/9211141.

[3]  E. Davies The Functional Calculus , 1995 .

[4]  C. Tracy,et al.  On Orthogonal and Symplectic Matrix Ensembles , 1995 .

[5]  K. Johansson Shape Fluctuations and Random Matrices , 1999, math/9903134.

[6]  I. Johnstone On the distribution of the largest eigenvalue in principal components analysis , 2001 .

[7]  A. Soshnikov A Note on Universality of the Distribution of the Largest Eigenvalues in Certain Sample Covariance Matrices , 2001, math/0104113.

[8]  Noureddine El Karoui On the largest eigenvalue of Wishart matrices with identity covariance when n, p and p/n tend to infinity , 2003, math/0309355.

[9]  S. Péché,et al.  Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices , 2004, math/0403022.

[10]  J. W. Silverstein,et al.  Eigenvalues of large sample covariance matrices of spiked population models , 2004, math/0408165.

[11]  S. Péché The largest eigenvalue of small rank perturbations of Hermitian random matrices , 2004, math/0411487.

[12]  I. Johnstone High Dimensional Statistical Inference and Random Matrices , 2006, math/0611589.

[13]  Noureddine El Karoui Tracy–Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices , 2005, math/0503109.

[14]  D. Paul Asymptotics of sample eigenstructure for a large dimensional spiked covariance model , 2007 .

[15]  S. Péché Universality results for largest eigenvalues of some sample covariance matrix ensembles , 2007, 0705.1701.

[16]  Alexei Borodin,et al.  Airy Kernel with Two Sets of Parameters in Directed Percolation and Random Matrix Theory , 2007, 0712.1086.

[17]  Z. Bai,et al.  Central limit theorems for eigenvalues in a spiked population model , 2008, 0806.2503.

[18]  Xavier Mestre,et al.  Improved Estimation of Eigenvalues and Eigenvectors of Covariance Matrices Using Their Sample Estimates , 2008, IEEE Transactions on Information Theory.

[19]  B. Nadler Finite sample approximation results for principal component analysis: a matrix perturbation approach , 2009, 0901.3245.

[20]  H. Yau,et al.  Universality of Sine-Kernel for Wigner Matrices with a Small Gaussian Perturbation , 2009, 0905.2089.

[21]  Raj Rao Nadakuditi,et al.  The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices , 2009, 0910.2120.

[22]  H. Yau,et al.  Bulk universality for generalized Wigner matrices , 2010, 1001.3453.

[23]  A. Guionnet,et al.  Large deviations of the extreme eigenvalues of random deformations of matrices , 2010, Probability Theory and Related Fields.

[24]  H. Yau,et al.  Rigidity of eigenvalues of generalized Wigner matrices , 2010, 1007.4652.

[25]  A. Guionnet,et al.  Fluctuations of the Extreme Eigenvalues of Finite Rank Deformations of Random Matrices , 2010, 1009.0145.

[26]  Terence Tao,et al.  Random matrices: Universal properties of eigenvectors , 2011, 1103.2801.

[27]  Jun Yin,et al.  The Isotropic Semicircle Law and Deformation of Wigner Matrices , 2011, 1110.6449.

[28]  Jun Yin,et al.  Eigenvector distribution of Wigner matrices , 2011, 1102.0057.

[29]  Alex Bloemendal,et al.  Limits of spiked random matrices II , 2011, 1109.3704.

[30]  N. Pillai,et al.  Universality of covariance matrices , 2011, 1110.2501.

[31]  A. Soshnikov,et al.  On finite rank deformations of Wigner matrices , 2011, 1103.3731.

[32]  A. Soshnikov,et al.  On finite rank deformations of Wigner matrices II: Delocalized perturbations , 2012, 1203.5130.

[33]  Raj Rao Nadakuditi,et al.  The singular values and vectors of low rank perturbations of large rectangular random matrices , 2011, J. Multivar. Anal..

[34]  Jun Yin,et al.  Delocalization and Diffusion Profile for Random Band Matrices , 2012, 1205.5669.

[35]  H. Yau,et al.  The local semicircle law for a general class of random matrices , 2012, 1212.0164.

[36]  Jianfeng Yao,et al.  On sample eigenvalues in a generalized spiked population model , 2008, J. Multivar. Anal..

[37]  The local circular law III: general case , 2012, 1212.6599.

[38]  Dai Shi,et al.  Asymptotic Joint Distribution of Extreme Sample Eigenvalues and Eigenvectors in the Spiked Population Model , 2013, 1304.6113.

[39]  H. Yau,et al.  The Eigenvector Moment Flow and Local Quantum Unique Ergodicity , 2013, 1312.1301.

[40]  Wang Zhou,et al.  Universality for the largest eigenvalue of sample covariance matrices with general population , 2013, 1304.5690.

[41]  Alex Bloemendal,et al.  Limits of spiked random matrices I , 2010, Probability Theory and Related Fields.

[42]  Antti Knowles,et al.  Averaging Fluctuations in Resolvents of Random Band Matrices , 2012, 1205.5664.

[43]  H. Yau,et al.  Isotropic local laws for sample covariance and generalized Wigner matrices , 2013, 1308.5729.

[44]  H. Yau,et al.  Edge Universality of Beta Ensembles , 2013, 1306.5728.

[45]  Jun Yin,et al.  The outliers of a deformed Wigner matrix , 2012, 1207.5619.