Bayesian Testing, Variable Selection and Model Averaging in Linear Models using R with BayesVarSel

This paper considers objective Bayesian methods for hypothesis testing and variable selection in linear models. The focus is on BayesVarSel, an R package that computes posterior probabilities of hypotheses/models and provides a suite of tools to summarize the results properly. We introduce the functions that compute several types of model-averaged estimates and predictions weighted by posterior probabilities. BayesVarSel contains exact algorithms for fast computation in problems of small to moderate size and heuristic sampling methods for large problems. We illustrate the functionality of the package with several data examples.

An illustrated overview of BayesVarSel

Testing and variable selection problems are taught in almost any introductory statistics course. In this first section we assume such background to present the essence of the Bayesian approach and the basic usage of BayesVarSel (Garcia-Donato and Forte, 2015) with hardly any mathematical formulas. Our main motivation in this first section is to present the appeal of the Bayesian answers to a very broad spectrum of applied researchers. This introductory section concludes with a discussion of connections with potentially related R packages.

The remaining six sections are organized as follows. The second section presents the problem and introduces the necessary notation together with the basics of the Bayesian methodology. Two sections then follow that detail how BayesVarSel obtains posterior probabilities in hypothesis testing and variable selection problems, respectively. A later section explains several tools for describing the posterior distribution, and a further section is devoted to model averaging techniques. The paper concludes with a section on plans for the future of the BayesVarSel project.
This paper is supplemented with an appendix containing formulas for the most delicate ingredient of the underlying problem in BayesVarSel, namely the prior distributions for the parameters within each model. The version of BayesVarSel presented here is 1.8.0.
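
For readers who want the central quantities in symbols, the posterior probabilities and the model-averaged summaries mentioned above can be sketched in the standard textbook form below. The notation is an assumption introduced here for illustration and is not necessarily the notation of the paper: $M_0,\dots,M_{N-1}$ are the competing models (with $M_0$ the simplest), $m_i(y)$ is the marginal likelihood of the data $y$ under $M_i$, $B_{i0}$ is the Bayes factor of $M_i$ against $M_0$, and $\Lambda$ is any quantity of interest, such as a regression coefficient or a future observation.

```latex
% Posterior probability of model M_i, written with marginal likelihoods
% and, equivalently, with Bayes factors against the simplest model M_0.
\[
  \Pr(M_i \mid y)
  = \frac{m_i(y)\,\Pr(M_i)}{\sum_{j=0}^{N-1} m_j(y)\,\Pr(M_j)}
  = \frac{B_{i0}\,\Pr(M_i)}{\sum_{j=0}^{N-1} B_{j0}\,\Pr(M_j)},
  \qquad
  B_{i0} = \frac{m_i(y)}{m_0(y)} .
\]

% Model-averaged posterior (or predictive) distribution of Lambda:
% a mixture of the within-model distributions, weighted by the
% posterior model probabilities.
\[
  f(\Lambda \mid y) = \sum_{i=0}^{N-1} f(\Lambda \mid y, M_i)\,\Pr(M_i \mid y).
\]
```

The prior distributions for the parameters within each model enter only through the marginal likelihoods $m_i(y)$, which is why their choice is described as the most delicate ingredient.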
