Bayes model averaging with selection of regressors

Summary. When a number of distinct models contend for use in prediction, choosing a single model can give rather unstable predictions. In regression, stochastic search variable selection with Bayesian model averaging offers a cure for this robustness issue, but at the expense of requiring very many predictors. Here we look at Bayes model averaging that incorporates variable selection for prediction. This offers similar mean-square errors of prediction but with a vastly reduced predictor space, which can greatly aid the interpretation of the model and reduces the cost of prediction when the variables are costly to measure. The development here uses decision theory in the context of the multivariate general linear model. In passing, this reduced predictor space Bayes model averaging is contrasted with single-model approximations. A fast algorithm for updating regressions in the Markov chain Monte Carlo searches for posterior inference is developed, allowing many more variables than observations to be contemplated. We discuss the merits of absolute rather than proportionate shrinkage in regression, especially when there are more variables than observations. The methodology is illustrated on a set of spectroscopic data used for measuring the amounts of different sugars in an aqueous solution.
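
To make the idea of averaging predictions over variable subsets concrete, the following is a minimal sketch and not the method developed in the paper. It assumes a single response, a predictor set small enough that every subset can be enumerated instead of searched by Markov chain Monte Carlo, and Zellner's g-prior, which is a proportionate shrinkage prior chosen here only because its Bayes factor has a closed form. The function name `bma_predict`, the parameters `g` and `prior_incl`, and the simulated data are all illustrative assumptions.

```python
import itertools
import numpy as np


def bma_predict(X, y, X_new, g=None, prior_incl=0.5):
    """Toy Bayesian model averaging over all subsets of the columns of X.

    Assumptions (not from the paper): univariate response, Zellner g-prior,
    independent Bernoulli(prior_incl) prior on each variable's inclusion,
    full enumeration of the 2**p models (so p must be small).
    """
    n, p = X.shape
    g = float(n) if g is None else float(g)  # unit-information style default
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centre so the intercept drops out
    tss = yc @ yc

    models, log_w, preds = [], [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            cols = list(subset)
            if cols:
                beta, *_ = np.linalg.lstsq(Xc[:, cols], yc, rcond=None)
                rss = np.sum((yc - Xc[:, cols] @ beta) ** 2)
                r2 = 1.0 - rss / tss
                # g-prior posterior mean shrinks least squares by g / (1 + g)
                shrunk = (g / (1.0 + g)) * beta
                pred = y_mean + (X_new[:, cols] - x_mean[cols]) @ shrunk
            else:
                r2 = 0.0
                pred = np.full(X_new.shape[0], y_mean)
            # log Bayes factor of this model against the intercept-only model
            log_bf = (0.5 * (n - 1 - k) * np.log1p(g)
                      - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2)))
            log_prior = k * np.log(prior_incl) + (p - k) * np.log1p(-prior_incl)
            models.append(subset)
            log_w.append(log_bf + log_prior)
            preds.append(pred)

    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())          # stabilise before normalising
    w /= w.sum()                             # posterior model probabilities
    averaged = w @ np.array(preds)           # model-averaged prediction
    incl = np.array([
        w[np.array([j in m for m in models])].sum() for j in range(p)
    ])                                       # marginal inclusion probabilities
    return averaged, w, incl


# Simulated data (hypothetical): only columns 0 and 2 affect the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=60)
X_new = rng.normal(size=(5, 6))

pred, weights, incl = bma_predict(X, y, X_new)
print("averaged predictions:", np.round(pred, 2))
print("inclusion probabilities:", np.round(incl, 2))
```

The inclusion probabilities in `incl` show which predictors carry most of the posterior weight. Restricting the average to those columns is one informal way to see how a reduced predictor space might retain most of the predictive accuracy, although the paper chooses its reduced set by decision theory and not by thresholding inclusion probabilities.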
