Estimation and optimal structure selection of high-dimensional Toeplitz covariance matrix

Abstract: The estimation of structured covariance matrices arises in many fields. An appropriate covariance structure not only improves the accuracy of covariance estimation but also increases the efficiency of mean parameter estimators in statistical models. In this paper, a novel statistical method is proposed that simultaneously selects the optimal Toeplitz covariance structure and estimates the covariance matrix. An entropy loss function with a nonconvex penalty is employed as a matrix-discrepancy measure; minimizing it selects the optimal sparse or nearly sparse Toeplitz structure and yields the parameter estimators of the covariance matrix at the same time. Both low-dimensional (p ≤ n) and high-dimensional (p > n) covariance matrix estimation are considered. The resulting Toeplitz-structured covariance estimators are guaranteed to be positive definite and consistent. Asymptotic properties are investigated, and simulation studies show that the Toeplitz covariance structure is estimated with very high accuracy. The proposed method is then applied to real data, where it demonstrates good performance in covariance estimation.
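To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the commonly used form of the entropy loss, L(A, B) = tr(A^{-1}B) - log det(A^{-1}B) - p, which the abstract does not spell out, and it uses plain diagonal averaging of the sample covariance as a stand-in Toeplitz estimator. The nonconvex penalty and the structure selection step are omitted. The names `ar1_cov`, `toeplitz_average`, and `entropy_loss` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def ar1_cov(p, rho):
    """Hypothetical helper: AR(1) covariance, a simple Toeplitz matrix rho^|i-j|."""
    idx = np.arange(p)
    return rho ** np.abs(np.subtract.outer(idx, idx))


def toeplitz_average(S):
    """Average each diagonal of a symmetric matrix S and rebuild a Toeplitz matrix.

    This is a naive stand-in estimator (an assumption, not the paper's method);
    unlike the estimator described in the abstract, it is not guaranteed to be
    positive definite.
    """
    p = S.shape[0]
    # Mean of the k-th superdiagonal gives the lag-k parameter.
    lags = np.array([np.mean(np.diag(S, k)) for k in range(p)])
    idx = np.arange(p)
    return lags[np.abs(np.subtract.outer(idx, idx))]


def entropy_loss(A, B):
    """Assumed entropy loss: tr(A^{-1}B) - log det(A^{-1}B) - p.

    For positive definite A and B it is nonnegative and equals zero when A == B.
    """
    p = A.shape[0]
    M = np.linalg.solve(A, B)            # A^{-1} B without forming the inverse
    _, logdet = np.linalg.slogdet(M)     # numerically stable log-determinant
    return np.trace(M) - logdet - p


# Small low-dimensional demo (p <= n) on simulated data.
rng = np.random.default_rng(0)
p, n = 5, 200
sigma_true = ar1_cov(p, 0.5)
X = rng.multivariate_normal(np.zeros(p), sigma_true, size=n)
S = np.cov(X, rowvar=False)              # sample covariance
T = toeplitz_average(S)                  # Toeplitz-structured estimate
loss_sample = entropy_loss(sigma_true, S)
loss_toeplitz = entropy_loss(sigma_true, T)
```

Comparing `loss_sample` with `loss_toeplitz` shows how imposing the Toeplitz structure changes the discrepancy from the true matrix in this simulated setting. In the high-dimensional case (p > n) the sample covariance is singular, so its log-determinant is not finite and a regularized estimator such as the one the abstract proposes is needed.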
