Bayesian Adaptive Sampling for Variable Selection and Model Averaging

For the problem of model choice in linear regression, we introduce a Bayesian adaptive sampling algorithm (BAS) that samples models without replacement from the space of models. For problems that permit enumeration of all models, BAS is guaranteed to enumerate the model space in 2^p iterations, where p is the number of potential variables under consideration. For larger problems where sampling is required, we provide conditions under which BAS provides perfect samples without replacement. When the sampling probabilities in the algorithm are the marginal variable inclusion probabilities, BAS may be viewed as sampling models “near” the median probability model of Barbieri and Berger. As marginal inclusion probabilities are not known in advance, we discuss several strategies for estimating them adaptively within BAS. We illustrate the performance of the algorithm using simulated and real data and show that BAS can outperform Markov chain Monte Carlo methods. The algorithm is implemented in the R package BAS, available on CRAN. This article has supplementary material online.
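The without-replacement idea can be illustrated on a toy problem. The following is a minimal Python sketch, not the BAS implementation from the R package: it assumes p is small enough to enumerate all 2^p models, treats the inclusion probabilities as fixed and known (no adaptive updating), and draws models one at a time in proportion to their product-Bernoulli probability, removing each drawn model so that the remaining mass is implicitly renormalized. The function names and the example probabilities are hypothetical.

```python
import itertools
import random


def model_probability(model, incl_probs):
    """Probability of a 0/1 inclusion vector under independent Bernoulli draws."""
    prob = 1.0
    for included, rho in zip(model, incl_probs):
        prob *= rho if included else 1.0 - rho
    return prob


def sample_without_replacement(incl_probs, n_draws, seed=0):
    """Draw distinct models, each chosen in proportion to its remaining mass.

    Assumption: every inclusion probability lies strictly between 0 and 1,
    so every model has positive probability and can eventually be drawn.
    """
    if not all(0.0 < rho < 1.0 for rho in incl_probs):
        raise ValueError("inclusion probabilities must lie strictly in (0, 1)")
    rng = random.Random(seed)
    p = len(incl_probs)
    # Enumerate all 2^p models with their probabilities (toy sizes only).
    remaining = {
        model: model_probability(model, incl_probs)
        for model in itertools.product((0, 1), repeat=p)
    }
    draws = []
    for _ in range(min(n_draws, 2 ** p)):
        models = list(remaining)
        weights = [remaining[m] for m in models]
        # random.choices normalizes the weights, so removing a model
        # renormalizes the distribution over the models not yet sampled.
        chosen = rng.choices(models, weights=weights, k=1)[0]
        draws.append(chosen)
        del remaining[chosen]
    return draws


def median_probability_model(incl_probs):
    """Include exactly the variables whose inclusion probability is >= 1/2."""
    return tuple(int(rho >= 0.5) for rho in incl_probs)


# Hypothetical inclusion probabilities for p = 3 candidate variables.
example_probs = [0.9, 0.2, 0.6]
print(median_probability_model(example_probs))
print(sample_without_replacement(example_probs, n_draws=8))
```

With n_draws equal to 2^p, the sketch visits every model exactly once. With fewer draws, models that agree with the median probability model on most variables tend to appear early, because they carry the largest probability under the assumed inclusion probabilities.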

[1]  E. George,et al.  Approaches for Bayesian Variable Selection , 1997 .

[2]  D. Madigan,et al.  Bayesian Model Averaging for Linear Regression Models , 1997 .

[3]  M. Clyde,et al.  Prediction via Orthogonalized Model Mixing , 1996 .

[4]  R. Kohn,et al.  Nonparametric regression using Bayesian variable selection , 1996 .

[5]  M. Ruiz Espejo Sampling , 2013, Encyclopedic Dictionary of Archaeology.

[6]  Adrian E. Raftery,et al.  Bayesian model averaging: a tutorial (with comments by M. Clyde, David Draper and E. I. George, and a rejoinder by the authors) , 1999 .

[7]  Adrian F. M. Smith,et al.  Automatic Bayesian curve fitting , 1998 .

[8]  J. York,et al.  Bayesian Graphical Models for Discrete Data , 1995 .

[9]  Scott C Schmidler,et al.  Bayesian Model Search and Multilevel Inference for SNP Association Studies , 2009, The Annals of Applied Statistics.

[10]  J. Berger,et al.  Optimal predictive model selection , 2004, math/0406464.

[11]  R. Fisher,et al.  On the Mathematical Foundations of Theoretical Statistics , 1922 .

[12]  A. P. Dawid,et al.  Bayesian Model Averaging and Model Search Strategies , 2007 .

[13]  B. D. Finetti,et al.  Bayesian inference and decision techniques : essays in honor of Bruno de Finetti , 1986 .

[14]  David J. Nott,et al.  Adaptive sampling for Bayesian variable selection , 2005 .

[15]  A. Zellner,et al.  Posterior odds ratios for selected regression hypotheses , 1980 .

[16]  J. Bernardo Bayesian statistics 6 : proceedings of the Sixth Valencia International Meeting, June 6-10, 1998 , 1999 .

[17]  D. Horvitz,et al.  A Generalization of Sampling Without Replacement from a Finite Universe , 1952 .

[18]  Faming Liang,et al.  Evolutionary Monte Carlo: Applications to C_p Model Sampling and Change Point Problem , 2000 .

[19]  G Parmigiani,et al.  Protein construct storage: Bayesian variable selection and prediction with mixtures. , 1998, Journal of biopharmaceutical statistics.

[20]  M. J. Bayarri,et al.  Calibration of p Values for Testing Precise Null Hypotheses , 2001 .

[21]  P. Green,et al.  Bayesian Variable Selection and the Swendsen-Wang Algorithm , 2004 .

[22]  M. Clyde,et al.  Model Uncertainty , 2003 .

[23]  Joyee Ghosh,et al.  A Note on the Bias in Estimating Posterior Probabilities in Variable Selection , 2010 .

[24]  M. Clyde,et al.  Mixtures of g Priors for Bayesian Variable Selection , 2008 .

[25]  C. Lawrence,et al.  Centroid estimation in discrete high-dimensional spaces with applications in biology , 2008, Proceedings of the National Academy of Sciences.

[27]  Robert W. Wilson,et al.  Regressions by Leaps and Bounds , 2000, Technometrics.