Hypothesis testing for high-dimensional covariance matrices

This paper studies hypothesis testing for high-dimensional covariance matrices. Two problems are considered, testing whether a covariance matrix equals the identity and testing whether two covariance matrices are equal, in the setting where the data dimension and the sample size are both large. Most importantly, the dimension can be much larger than the sample size. The proposed test statistics are built upon the Stieltjes transform of the spectral distribution of the sample covariance matrix. We prove that the proposed statistics are asymptotically chi-square distributed under the null hypotheses and asymptotically normally distributed under the alternative hypotheses. Simulation results show that, for finite dimension and sample size, the proposed tests outperform some existing methods in various cases.
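
For readers unfamiliar with the central object, the following is a minimal sketch of the Stieltjes transform of the empirical spectral distribution of a sample covariance matrix, m_n(z) = (1/p) * sum_i 1/(lambda_i - z), evaluated at a complex point z with positive imaginary part. It is not the paper's test statistic. The function name `empirical_stieltjes`, the centering and n - 1 normalization, the simulated Gaussian data, and the choice of z are all assumptions made for illustration.

```python
import numpy as np


def empirical_stieltjes(samples, z):
    """Stieltjes transform of the empirical spectral distribution.

    samples: array of shape (n, p), one observation per row.
    z: complex number with positive imaginary part.
    Returns m_n(z) = (1/p) * sum_i 1 / (lambda_i - z), where lambda_i are
    the eigenvalues of the sample covariance matrix.
    """
    n, p = samples.shape
    centered = samples - samples.mean(axis=0)
    # Assumed normalization: the usual unbiased sample covariance (divide by n - 1).
    cov = centered.T @ centered / (n - 1)
    eigvals = np.linalg.eigvalsh(cov)  # real eigenvalues of a symmetric matrix
    return np.mean(1.0 / (eigvals - z))


# Illustrative data only: n = 50 observations in dimension p = 200 (p > n),
# drawn with identity population covariance.
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 200))
z0 = 1.0 + 0.5j
m = empirical_stieltjes(x, z0)
print(m)
```

Because the transform is evaluated away from the real axis, it remains well defined when p exceeds n and the sample covariance matrix is singular, which is one reason spectral-transform-based quantities are convenient in this regime.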
