Bayesian and Empirical Bayesian Forests

We derive ensembles of decision trees through a nonparametric Bayesian model, which allows us to view random forests as samples from a posterior distribution. This insight provides large gains in interpretability and motivates a class of Bayesian forest (BF) algorithms that yield small but reliable performance gains. Using the BF framework, we show that high-level tree hierarchy is stable in large samples. This leads to an empirical Bayesian forest (EBF) algorithm for building approximate BFs on massive distributed datasets, and we show that EBFs outperform subsampling-based alternatives by a large margin.
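The idea of treating each tree as a posterior draw can be illustrated with a small sketch. It rests on assumptions the abstract does not spell out: each draw is generated by giving every observation an independent Exponential(1) weight (a Bayesian-bootstrap-style scheme), and a one-split regression stump stands in for a full tree. The function names and the simulated step-function data are hypothetical and chosen only for illustration.

```python
import random


def fit_weighted_stump(x, y, w):
    """Fit a one-split regression stump by weighted least squares.

    Returns (threshold, left_mean, right_mean).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    total_w = sum(w)
    total_wy = sum(w[i] * y[i] for i in order)
    best = None
    left_w = left_wy = 0.0
    for pos in range(len(order) - 1):
        i, j = order[pos], order[pos + 1]
        left_w += w[i]
        left_wy += w[i] * y[i]
        if x[i] == x[j]:
            continue  # cannot split between identical x values
        right_w = total_w - left_w
        right_wy = total_wy - left_wy
        # Maximising this quantity is equivalent to minimising weighted SSE.
        gain = left_wy ** 2 / left_w + right_wy ** 2 / right_w
        if best is None or gain > best[0]:
            best = (gain, (x[i] + x[j]) / 2, left_wy / left_w, right_wy / right_w)
    return best[1:]


def bayesian_forest(x, y, n_trees, rng):
    """Assumed scheme: one stump per draw of Exponential(1) observation weights."""
    forest = []
    for _ in range(n_trees):
        w = [rng.expovariate(1.0) for _ in x]
        forest.append(fit_weighted_stump(x, y, w))
    return forest


def predict(forest, x0):
    """Average the stump predictions, i.e. an ensemble (posterior-mean style) estimate."""
    return sum(left if x0 <= t else right for t, left, right in forest) / len(forest)


# Simulated data: a noisy step function with a jump at x = 0.5.
rng = random.Random(0)
x = [rng.uniform(0, 1) for _ in range(2000)]
y = [(1.0 if xi > 0.5 else 0.0) + rng.gauss(0, 0.3) for xi in x]

forest = bayesian_forest(x, y, n_trees=50, rng=rng)
thresholds = [t for t, _, _ in forest]
print("spread of top split:", min(thresholds), max(thresholds))
print("predictions:", predict(forest, 0.1), predict(forest, 0.9))
```

In this toy setting the top split barely moves from one weight draw to the next, which is the kind of large-sample stability of high-level tree structure that the abstract refers to. An EBF-style approach would presumably fix such a stable top split once and fit the lower levels separately on each data partition, though that step is not shown here.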
