Consistency of spike and slab regression

Spike and slab models are a popular and attractive approach to variable selection in regression settings. Applications of these models have blossomed over the last decade, and they are increasingly being used in challenging problems. Theory for spike and slab models, however, has not kept pace with the applications, and there are many gaps in what we know about their theoretical properties. An important property known to hold in these models is selective shrinkage: the posterior mean is shrunk toward zero for non-informative variables only. This property has been shown to hold under orthogonality for continuous priors in the modified class of rescaled spike and slab models. In this paper, we extend this result to the general case and prove an oracle property for the posterior mean under a discrete two-component prior. An immediate consequence is that a strong selective shrinkage property holds. Interestingly, the conditions needed for our result to hold in the non-orthogonal setting are more stringent than in the orthogonal case and amount to a type of enforced sparsity condition that must be met by the prior.
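As a rough illustration of selective shrinkage under a discrete two-component prior, the following minimal sketch considers a single coefficient with one noisy observation. It is a toy scalar setting, not the rescaled spike and slab model or the regression setting analyzed in the paper. The function names and the values of `w`, `tau2`, and `s2` are hypothetical choices made for this example.

```python
import math


def normal_pdf(x, var):
    """Density at x of a zero-mean normal distribution with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def posterior_mean(y, w=0.5, tau2=10.0, s2=1.0):
    """Posterior mean of beta in a toy model (all settings are assumptions):

        y | beta ~ N(beta, s2)
        beta     ~ (1 - w) * point mass at 0  +  w * N(0, tau2)
    """
    # Marginal density of y under each prior component.
    slab = w * normal_pdf(y, tau2 + s2)
    spike = (1.0 - w) * normal_pdf(y, s2)
    # Posterior probability that beta was drawn from the slab.
    p_slab = slab / (slab + spike)
    # Under the slab, E[beta | y] = tau2 / (tau2 + s2) * y; under the spike it is 0.
    return p_slab * (tau2 / (tau2 + s2)) * y


if __name__ == "__main__":
    for y in (0.5, 5.0):
        m = posterior_mean(y)
        print(f"y = {y:4.1f}  posterior mean = {m:.4f}  ratio = {m / y:.3f}")
```

In this toy setting, an observation near zero is pulled strongly toward zero because most posterior weight falls on the spike, while a large observation is shrunk only by the slab factor `tau2 / (tau2 + s2)`. This is the qualitative behavior that the term selective shrinkage describes.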
