Variable Selection Using Bayesian Additive Regression Trees

Variable selection is an important statistical problem. It becomes more challenging when the candidate predictors are of mixed type (e.g., continuous and binary) and affect the response in nonlinear and/or non-additive ways. In this paper, we review existing variable selection approaches for the Bayesian additive regression trees (BART) model, a nonparametric regression model flexible enough to capture interactions among predictors and nonlinear relationships with the response. This review emphasizes each method's ability to identify relevant predictors. We also propose two variable importance measures that can be used within a permutation-based variable selection approach, as well as a backward variable selection procedure for BART. We present simulations demonstrating that, compared to existing BART-based variable selection methods, our approaches more reliably recover all the relevant predictors across a variety of data settings.
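The permutation-based idea underlying the proposed approach can be illustrated with a minimal sketch: permute one predictor at a time and record how much a fitted model's prediction error increases. This is a generic illustration, not the paper's exact procedure; the `predict` function below is a hypothetical stand-in for BART posterior predictions, and the toy data are invented for the example.

```python
import random

random.seed(0)

# Toy data: y depends on x1 only; x2 is irrelevant noise.
n = 200
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
y = [3.0 * x1 + random.gauss(0, 0.1) for x1, _ in X]

# Stand-in "fitted model": here we simply use the true regression
# function; a real application would plug in BART predictions.
def predict(row):
    return 3.0 * row[0]

def mse(model, X, y):
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, j, reps=20):
    """Average increase in MSE when column j is permuted."""
    base = mse(model, X, y)
    deltas = []
    for _ in range(reps):
        col = [row[j] for row in X]
        random.shuffle(col)
        Xp = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        deltas.append(mse(model, Xp, y) - base)
    return sum(deltas) / reps

imp = [permutation_importance(predict, X, y, j) for j in range(2)]
print(imp)  # importance of x1 should dwarf that of x2
```

Because the stand-in model ignores x2 entirely, permuting it leaves the error unchanged, while permuting x1 inflates it sharply; a variable selection rule can then threshold these importances (e.g., against a null distribution obtained by permuting the response).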
