Posterior contraction in sparse Bayesian factor models for massive covariance matrices

Sparse Bayesian factor models are routinely used for parsimonious dependence modeling and dimensionality reduction in high-dimensional applications. We provide a theoretical understanding of such Bayesian procedures in terms of posterior convergence rates for inferring high-dimensional covariance matrices, where the dimension can be larger than the sample size. Under relevant sparsity assumptions on the true covariance matrix, we show that commonly used point mass mixture priors on the factor loadings lead to consistent estimation in the operator norm even when $p \gg n$. One of our major contributions is to develop a new class of continuous shrinkage priors and to provide insights into their concentration around sparse vectors. Using such priors for the factor loadings, we obtain rates of convergence similar to those obtained with point mass mixture priors. To obtain the convergence rates, we construct test functions that separate points in the space of high-dimensional covariance matrices using insights from random matrix theory; the tools developed may be of independent interest. We also derive minimax rates and show that the Bayesian posterior rates of convergence coincide with the minimax rates up to a $\sqrt{\log n}$ term.
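As a rough illustration of the setting described above, the following sketch draws a sparse loading matrix from a point mass mixture prior, forms the implied covariance matrix, and measures how far the sample covariance is from it in the operator norm when $p > n$. The covariance form $\Lambda\Lambda^{\mathrm{T}} + \sigma^2 I_p$, the Gaussian slab, the function names, and all numerical settings are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def sample_sparse_loadings(p, k, pi, rng, slab_scale=1.0):
    """Draw a p x k loading matrix from a point mass mixture prior.

    Each entry is exactly zero with probability 1 - pi and is drawn from a
    Gaussian slab with probability pi (the Gaussian slab is an assumption).
    """
    nonzero = rng.random((p, k)) < pi
    slab = rng.normal(0.0, slab_scale, size=(p, k))
    return np.where(nonzero, slab, 0.0)


def factor_covariance(loadings, sigma2):
    """Covariance implied by a factor model: Lambda Lambda^T + sigma2 * I."""
    p = loadings.shape[0]
    return loadings @ loadings.T + sigma2 * np.eye(p)


def operator_norm(a):
    """Operator (spectral) norm, i.e. the largest singular value."""
    return np.linalg.norm(a, ord=2)


rng = np.random.default_rng(0)
p, k, n, sigma2 = 200, 3, 50, 1.0  # illustrative sizes with p > n

Lambda = sample_sparse_loadings(p, k, pi=0.05, rng=rng)
Sigma = factor_covariance(Lambda, sigma2)

# Simulate n observations y_i = Lambda eta_i + eps_i with standard normal
# factors eta_i and independent N(0, sigma2) noise eps_i.
eta = rng.normal(size=(n, k))
Y = eta @ Lambda.T + np.sqrt(sigma2) * rng.normal(size=(n, p))

# Sample covariance (the mean is known to be zero) and its operator-norm error.
S = Y.T @ Y / n
print("operator-norm error of sample covariance:", operator_norm(S - Sigma))
```

The sample covariance serves here only as a naive baseline for the operator-norm loss; the sketch does not implement posterior computation under either class of priors.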
