Efficient Metropolis–Hastings Proposal Mechanisms for Bayesian Regression Tree Models

Bayesian regression trees are flexible non-parametric models that are well suited to many modern statistical regression problems. Many such tree models have been proposed, from the simple single-tree model to more complex tree ensembles. Their non-parametric formulation allows for effective and efficient modeling of datasets exhibiting complex non-linear relationships between the model predictors and observations. However, the mixing behavior of the Markov Chain Monte Carlo (MCMC) sampler is sometimes poor. This is because the proposals in the sampler are typically local alterations of the tree structure, such as the birth/death of leaf nodes, which do not allow for efficient traversal of the model space. This poor mixing can lead to inferential problems, such as under-representing uncertainty. In this paper, we develop two novel proposal mechanisms for efficient sampling. The first is a rule perturbation proposal, while the second we call tree rotation. The perturbation proposal can be seen as an efficient variation of the change proposal found in the existing literature. The tree rotation proposal is simple to implement, as it only requires local changes to the regression tree structure, yet it efficiently traverses disparate regions of the model space along contours of equal probability. When combined with the classical birth/death proposal, the resulting MCMC sampler exhibits good acceptance rates and properly represents model uncertainty in the posterior samples. We implement this sampling algorithm in the Bayesian Additive Regression Tree (BART) model and demonstrate its effectiveness on a prediction problem from computer experiments and on a test function where structural tree variability is needed to fully explore the posterior.
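To make the role of a proposal in a Metropolis–Hastings sampler concrete, the following is a minimal sketch of a generic accept/reject step applied to a toy problem: a single split point `c` of a one-split tree (a stump) on the unit interval. It is not the paper's perturbation or rotation proposal. The names `mh_step`, `make_perturb` and `run_toy_chain`, the window width, the fixed leaf means (0 and 1), the known noise level, and the flat prior on `c` are all assumptions chosen for illustration.

```python
import math
import random


def mh_step(state, log_post, propose, rng):
    """One generic Metropolis-Hastings step.

    propose(state, rng) must return (candidate, log_q_forward, log_q_reverse),
    where log_q_forward = log q(candidate | state) and
    log_q_reverse = log q(state | candidate).
    """
    cand, log_q_fwd, log_q_rev = propose(state, rng)
    log_alpha = log_post(cand) - log_post(state) + log_q_rev - log_q_fwd
    accept = log_alpha >= 0.0 or rng.random() < math.exp(log_alpha)
    return (cand, True) if accept else (state, False)


def make_perturb(width):
    """Hypothetical local perturbation: uniform draw from a window around the
    current cut point, clipped to [0, 1]. Clipping makes the proposal
    asymmetric near the boundaries, so both densities are returned."""
    def window(c):
        return max(0.0, c - width), min(1.0, c + width)

    def propose(c, rng):
        lo, hi = window(c)
        cand = rng.uniform(lo, hi)
        rlo, rhi = window(cand)
        return cand, -math.log(hi - lo), -math.log(rhi - rlo)

    return propose


def run_toy_chain(n_iter=5000, burn=1000, seed=0):
    """Sample the cut point of a stump fitted to noisy step-function data."""
    rng = random.Random(seed)
    n, sigma = 50, 0.2  # assumed sample size and known noise level
    xs = [(i + 0.5) / n for i in range(n)]
    ys = [(1.0 if x > 0.5 else 0.0) + rng.gauss(0.0, sigma) for x in xs]

    def log_post(c):
        # Flat prior on c; Gaussian likelihood with leaf means fixed at 0 and 1.
        sse = sum((y - (1.0 if x > c else 0.0)) ** 2 for x, y in zip(xs, ys))
        return -sse / (2.0 * sigma ** 2)

    propose = make_perturb(0.1)
    c, accepted, draws = 0.1, 0, []
    for it in range(n_iter):
        c, ok = mh_step(c, log_post, propose, rng)
        accepted += ok
        if it >= burn:
            draws.append(c)
    return sum(draws) / len(draws), accepted / n_iter


if __name__ == "__main__":
    mean_c, rate = run_toy_chain()
    print(f"posterior mean of cut point: {mean_c:.3f}, acceptance rate: {rate:.3f}")
```

In this toy setting a local window suffices because the state is a single number. The difficulty described above arises when the state is an entire tree structure and purely local moves must pass through low-probability intermediate trees.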
