High-dimensional Covariance Estimation Based On Gaussian Graphical Models

Undirected graphs are often used to describe high-dimensional distributions. Under sparsity conditions, the graph can be estimated using ℓ1-penalization methods. We propose and study a method that combines a multiple regression approach with ideas of thresholding and refitting: first, we infer a sparse undirected graphical model structure by thresholding each of many ℓ1-norm penalized regression functions; we then estimate the covariance matrix and its inverse using the maximum likelihood estimator. We show that, under suitable conditions, this approach yields consistent estimation of the graphical structure and fast convergence rates in the operator and Frobenius norms for the covariance matrix and its inverse. We also derive an explicit bound for the Kullback-Leibler divergence.
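The two-stage procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the penalty `lam` and threshold `tau`, the "OR" rule for combining the per-node regressions, a plain coordinate-descent Lasso solver, and a standard iterative regression scheme for the maximum likelihood refit under the selected zero pattern. The function names (`lasso_cd`, `select_graph`, `constrained_mle`) and the simulated chain graph are hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                              # residual y - X @ b
    for _ in range(n_iter):
        for k in range(p):
            r += X[:, k] * b[k]               # remove coordinate k from the fit
            rho = X[:, k] @ r / n
            b[k] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[k]
            r -= X[:, k] * b[k]               # put the updated coordinate back
    return b

def select_graph(X, lam, tau):
    """Stage 1: regress each node on the others, then threshold at tau."""
    p = X.shape[1]
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        b = lasso_cd(X[:, others], X[:, j], lam)
        for k, bk in zip(others, b):
            if abs(bk) > tau:
                adj[j, k] = adj[k, j] = True  # "OR" rule (an assumption)
    return adj

def constrained_mle(S, adj, n_sweeps=200, tol=1e-10):
    """Stage 2: Gaussian MLE with precision entries fixed to 0 off the graph."""
    p = S.shape[0]
    W = S.copy()                              # running covariance estimate
    B = np.zeros((p, p))                      # B[j]: coefficients of node j
    for _ in range(n_sweeps):
        W_old = W.copy()
        for j in range(p):
            nb = np.flatnonzero(adj[j])       # selected neighbours of node j
            rest = np.arange(p) != j
            beta = np.zeros(p)
            if nb.size:
                beta[nb] = np.linalg.solve(W[np.ix_(nb, nb)], S[nb, j])
            w12 = W[np.ix_(rest, rest)] @ beta[rest]
            W[rest, j] = w12
            W[j, rest] = w12
            B[j] = beta
        if np.abs(W - W_old).max() < tol:
            break
    Theta = np.zeros((p, p))
    for j in range(p):
        rest = np.arange(p) != j
        theta_jj = 1.0 / (S[j, j] - W[rest, j] @ B[j, rest])
        Theta[j, j] = theta_jj
        Theta[rest, j] = -B[j, rest] * theta_jj
    return W, (Theta + Theta.T) / 2           # symmetrize rounding differences

# Simulated example: a chain graph on 5 nodes (tridiagonal precision matrix).
p, n = 5, 2000
Theta0 = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Theta0), size=n)
X -= X.mean(axis=0)
S = X.T @ X / n                               # sample covariance

adj = select_graph(X, lam=0.05, tau=0.1)
W, Theta = constrained_mle(S, adj)
print(adj.astype(int))
print(np.round(Theta, 2))
```

Here `W` agrees with the sample covariance `S` on the diagonal and on the selected edges, while `Theta` is zero wherever no edge was selected; this is the defining property of the Gaussian maximum likelihood estimate under a fixed graph. In practice `lam` and `tau` would be chosen from theory or by cross-validation rather than fixed by hand as they are here.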
