Automatic Bayesian model averaging for linear regression and applications in Bayesian curve fitting

With the development of MCMC methods, Bayesian methods play an increasingly important role in model selection and statistical prediction. However, the sensitivity of these methods to prior distributions has caused much difficulty for users. In the context of multiple linear regression, we propose an automatic prior setting in which no parameter needs to be specified by the user. Under this prior setting, we show that sampling from the posterior distribution is approximately equivalent to sampling from a Boltzmann distribution defined on Cp values. Numerical results show that the Bayesian model averaging procedure resulting from the automatic prior setting provides a significant improvement in predictive performance over two other procedures proposed in the literature. The procedure is extended to the problem of Bayesian curve fitting with regression splines. Evolutionary Monte Carlo is used to sample from the posterior distributions.
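
To make the idea of a Boltzmann distribution defined on Cp values concrete, the following is a minimal sketch rather than the procedure of the paper. It enumerates every subset of predictors, computes Mallows' Cp for each, and weights the models by exp(-Cp / tau). The temperature `tau`, its default of 2, the function names, and the use of exhaustive enumeration in place of Evolutionary Monte Carlo are all assumptions made for illustration; the automatic prior itself is not reproduced here.

```python
import itertools
import numpy as np


def mallows_cp(X, y, subset, sigma2_full):
    """Return (Cp, coefficients) for the model with an intercept plus `subset`."""
    n = len(y)
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = float(np.sum((y - Xs @ beta) ** 2))
    p = Xs.shape[1]  # number of fitted coefficients, intercept included
    return rss / sigma2_full + 2 * p - n, beta


def boltzmann_bma(X, y, tau=2.0):
    """Weight all 2^k submodels by exp(-Cp / tau); tau is an assumed temperature."""
    n, k = X.shape
    # Error variance estimated from the full model, as in the usual Cp definition.
    X_full = np.column_stack([np.ones(n), X])
    beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    sigma2_full = float(np.sum((y - X_full @ beta_full) ** 2)) / (n - k - 1)

    subsets, cps, coefs = [], [], []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            cp, beta = mallows_cp(X, y, subset, sigma2_full)
            padded = np.zeros(k + 1)  # excluded predictors keep coefficient 0
            padded[[0] + [j + 1 for j in subset]] = beta
            subsets.append(subset)
            cps.append(cp)
            coefs.append(padded)

    cps = np.array(cps)
    weights = np.exp(-(cps - cps.min()) / tau)  # shift by the minimum for stability
    weights /= weights.sum()
    beta_avg = weights @ np.array(coefs)  # model-averaged coefficients
    return subsets, cps, weights, beta_avg


def predict(beta_avg, X_new):
    """Model-averaged prediction using the averaged coefficient vector."""
    return beta_avg[0] + X_new @ beta_avg[1:]
```

Exhaustive enumeration is only feasible for a small number of predictors. For larger problems the weights would have to be approximated by sampling models, which is the role the abstract assigns to Evolutionary Monte Carlo.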
