BAYESIAN LINEAR REGRESSION WITH SPARSE PRIORS

We study full Bayesian procedures for high-dimensional linear regression under sparsity constraints. The prior is a mixture of point masses at zero and continuous distributions. Under compatibility conditions on the design matrix, the posterior distribution is shown to contract at the optimal rate for recovery of the unknown sparse vector, and to give optimal prediction of the response vector. It is also shown to select the correct sparse model, or at least the coefficients that are significantly different from zero. The asymptotic shape of the posterior distribution is characterized and used to construct and study credible sets for uncertainty quantification.
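
To make the form of the prior concrete, the following is a minimal sketch of drawing one coefficient vector from a mixture of a point mass at zero and a continuous distribution. The abstract does not specify the mixing weights or the continuous component, so the i.i.d. Bernoulli inclusion indicators, the Laplace slab, and the names `sample_spike_and_slab`, `inclusion_prob`, and `slab_scale` are assumptions made for illustration only.

```python
import numpy as np


def sample_spike_and_slab(p, inclusion_prob, slab_scale, rng):
    """Draw one coefficient vector from an illustrative spike-and-slab prior.

    Assumptions (not taken from the text): coordinates are included
    independently with probability `inclusion_prob`, and the continuous
    component is a Laplace distribution with scale `slab_scale`.
    """
    # Coordinates marked active receive a continuous draw; the rest stay at zero.
    active = rng.random(p) < inclusion_prob
    beta = np.zeros(p)
    beta[active] = rng.laplace(loc=0.0, scale=slab_scale, size=int(active.sum()))
    return beta


rng = np.random.default_rng(0)
beta = sample_spike_and_slab(p=1000, inclusion_prob=0.01, slab_scale=1.0, rng=rng)
print("nonzero coefficients:", np.count_nonzero(beta))
```

With a small inclusion probability, most coordinates of a draw are exactly zero, which is the sense in which such a prior encodes sparsity.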
