Estimation of the population spectral distribution from a large dimensional sample covariance matrix

Abstract: This paper introduces a new method for estimating the spectral distribution of a population covariance matrix from high-dimensional data. The method rests on a generalization of the seminal Marčenko–Pastur equation from the complex plane, where it was originally defined, to the real line. The new estimator is easy to implement and asymptotically consistent, and it outperforms two existing estimators from the literature in almost all situations tested in a simulation experiment. An application to the correlation matrix of S&P 500 daily stock returns is also given.
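To make the central equation concrete, here is a minimal numerical sketch. It assumes the standard Silverstein form of the Marčenko–Pastur equation, u = -1/s(u) + c ∫ t/(1 + t s(u)) dH(t), evaluated at a real point u to the right of the sample spectrum. Here s is the companion Stieltjes transform, c = p/n, and H is the population spectral distribution. The function names, the choice H = δ_1 (identity covariance), and the simulation sizes are illustrative assumptions and are not taken from the paper; the sketch only checks that the equation holds approximately on simulated data and is not the paper's estimator.

```python
import numpy as np


def companion_stieltjes(eigenvalues, u, c):
    """Empirical companion Stieltjes transform at a real u outside the spectrum.

    Uses s_n(u) = mean(1 / (lambda_i - u)) and the standard relation
    companion(u) = -(1 - c) / u + c * s_n(u), with c = p / n.
    """
    s_n = np.mean(1.0 / (eigenvalues - u))
    return -(1.0 - c) / u + c * s_n


def silverstein_rhs(s, c, atoms, weights):
    """Right-hand side -1/s + c * sum_k w_k t_k / (1 + t_k s) for a discrete H."""
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return -1.0 / s + c * np.sum(weights * atoms / (1.0 + atoms * s))


def simulate_check(p=200, n=400, u=5.0, seed=0):
    """Simulate identity-covariance data and evaluate the right-hand side at u.

    If the equation holds, the returned value should be close to u.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((p, n))            # p variables, n observations
    eigenvalues = np.linalg.eigvalsh(x @ x.T / n)
    c = p / n
    s = companion_stieltjes(eigenvalues, u, c)
    return silverstein_rhs(s, c, atoms=[1.0], weights=[1.0])


if __name__ == "__main__":
    u = 5.0
    print("u =", u, " right-hand side =", simulate_check(u=u))
```

An estimator in this spirit would treat the atoms and weights of H as unknowns and choose them so that the right-hand side matches u at several real points; that fitting step is omitted here.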
