On Posterior Consistency of Bayesian Factor Models in High Dimensions

As a principled dimension reduction technique, factor models have been widely adopted in social science, economics, bioinformatics, and many other fields. In high-dimensional settings, however, conducting a "correct" Bayesian factor analysis is subtle: it requires both a careful prescription of the prior distribution and a suitable computational strategy. We analyze the issues that arise from attempting to be "noninformative" about the elements of the factor loading matrix, particularly for sparse Bayesian factor models in high dimensions, and propose solutions. We show why adopting the orthogonal factor assumption is appropriate and yields consistent posterior inference for the loading matrix, conditional on the true idiosyncratic variance and the allocation of nonzero elements in the true loading matrix. We also provide an efficient Gibbs sampler for full posterior inference, based on the prior setup of Rockova and George (2016) and a uniform orthogonal factor assumption on the factor matrix.
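
To make the sampling step concrete, the sketch below shows a generic Gibbs sweep for a Gaussian factor model Y = F Λᵀ + E with conjugate normal priors on the loadings and inverse-gamma priors on the idiosyncratic variances. This is a minimal illustration under assumed defaults: the function name and hyperparameters are hypothetical, and it uses ordinary conjugate updates rather than the Rockova-George spike-and-slab prior or the uniform orthogonal prior on the factor matrix discussed in the abstract.

```python
import numpy as np

def gibbs_factor_model(Y, k, n_iter=1000, tau2=1.0, a0=2.0, b0=1.0, seed=0):
    """Illustrative Gibbs sampler for Y = F @ Lambda.T + E, E_ij ~ N(0, sigma2_j).

    Assumes N(0, tau2) priors on loadings, IG(a0, b0) priors on idiosyncratic
    variances, and standard normal factor scores (not the paper's prior setup).
    """
    rng = np.random.default_rng(seed)
    n, p = Y.shape
    Lambda = rng.normal(scale=0.1, size=(p, k))   # loading matrix
    sigma2 = np.ones(p)                           # idiosyncratic variances
    draws = []
    for _ in range(n_iter):
        # 1) factor scores: f_i | Lambda, sigma2, y_i ~ N(V Lambda' Sigma^{-1} y_i, V)
        prec = Lambda.T @ (Lambda / sigma2[:, None]) + np.eye(k)
        V = np.linalg.inv(prec)
        F = (Y / sigma2) @ Lambda @ V + rng.multivariate_normal(np.zeros(k), V, size=n)
        # 2) loadings, one response column at a time (conjugate normal update)
        FtF = F.T @ F
        for j in range(p):
            Vj = np.linalg.inv(FtF / sigma2[j] + np.eye(k) / tau2)
            mj = Vj @ (F.T @ Y[:, j]) / sigma2[j]
            Lambda[j] = rng.multivariate_normal(mj, Vj)
        # 3) idiosyncratic variances from the conjugate inverse-gamma
        resid = Y - F @ Lambda.T
        sigma2 = 1.0 / rng.gamma(a0 + n / 2, 1.0 / (b0 + 0.5 * (resid ** 2).sum(axis=0)))
        draws.append((Lambda.copy(), sigma2.copy()))
    return draws
```

A call such as gibbs_factor_model(Y, k=5) returns posterior draws of the loadings and idiosyncratic variances; note that without an identifiability constraint the loading matrix is only determined up to rotation, which is the role the orthogonal factor assumption plays in the paper.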

[1] Shizhong Xu, et al., Mapping Quantitative Trait Loci for Expression Abundance, 2007, Genetics.

[2] Donald Geman, et al., Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Debdeep Pati, et al., Posterior contraction in sparse Bayesian factor models for massive covariance matrices, 2012, arXiv:1206.3627.

[4] Jun S. Liu, et al., Generalised Gibbs sampler and multigrid Monte Carlo for Bayesian computation, 2000, Biometrika.

[5] M. West, et al., High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics, 2008, Journal of the American Statistical Association.

[6] M. Stone, et al., Marginalization Paradoxes in Bayesian and Structural Inference, 1973.

[7] Jun S. Liu, et al., Parameter Expansion for Data Augmentation, 1999.

[8] Adrian F. M. Smith, et al., Sampling-Based Approaches to Calculating Marginal Densities, 1990.

[9] Elizabeth Meckes, et al., Concentration of Measure and the Compact Classical Matrix Groups, 2014.

[10] Tim Hesterberg, et al., Monte Carlo Strategies in Scientific Computing, 2002, Technometrics.

[11] Carey E. Priebe, et al., Bayesian Estimation of Sparse Spiked Covariance Matrices in High Dimensions, 2018, arXiv:1808.07433.

[12] M. West, et al., Bayesian factor regression models in the "large p, small n" paradigm, 2003.

[13] David B. Dunson, et al., Default Prior Distributions and Efficient Posterior Computation in Bayesian Factor Analysis, 2009, Journal of Computational and Graphical Statistics.

[14] Anru R. Zhang, et al., Rate-Optimal Perturbation Bounds for Singular Subspaces with Applications to High-Dimensional Statistics, 2016, arXiv:1605.00353.

[15] A. Owen, et al., AGEMAP: A Gene Expression Database for Aging in Mice, 2007, PLoS Genetics.

[16] E. George, et al., Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity, 2016.

[17] W. Wong, et al., The calculation of posterior distributions by data augmentation, 1987.

[18] H. Lopes, et al., Sparse Bayesian Factor Analysis When the Number of Factors Is Unknown, 2018, Bayesian Analysis.

[19] H. Kaiser, The varimax criterion for analytic rotation in factor analysis, 1958.

[20] Ranjini Natarajan, et al., Gibbs Sampling with Diffuse Proper Priors: A Valid Approach to Data-Driven Inference?, 1998.

[21] D. Dunson, et al., Sparse Bayesian infinite factor models, 2011, Biometrika.

[22] L. Wasserman, et al., The Selection of Prior Distributions by Formal Rules, 1996.