Optimal estimation and rank detection for sparse spiked covariance matrices

This paper considers a sparse spiked covariance matrix model in the high-dimensional setting and studies the minimax estimation of the covariance matrix and the principal subspace as well as the minimax rank detection. The optimal rate of convergence for estimating the spiked covariance matrix under the spectral norm is established, which requires significantly different techniques from those for estimating other structured covariance matrices such as bandable or sparse covariance matrices. We also establish the minimax rate under the spectral norm for estimating the principal subspace, the primary object of interest in principal component analysis. In addition, the optimal rate for the rank detection boundary is obtained. This result also resolves the gap in a recent paper by Berthet and Rigollet (Ann Stat 41(4):1780–1815, 2013) where the special case of rank one is considered.
