Bayesian Additive Regression Trees (BART) is a Bayesian nonparametric approach that has been shown to be competitive with the best modern predictive methods, such as random forests and gradient boosted decision trees. The sum-of-trees structure, combined with a Bayesian inferential framework, provides an accurate and robust statistical method. A BART variant named SBART, which uses randomized decision trees, has been developed and shows practical benefits compared to BART. The primary bottleneck of SBART is the time needed to compute the sufficient statistics, and the publicly available implementation of the SBART algorithm in the R package is slow. In this paper we show how the SBART algorithm can be modified and computed using single program, multiple data (SPMD) distributed computation with the Message Passing Interface (MPI) library. This approach scales nearly linearly in the number of processor cores, enabling the practitioner to perform statistical inference on massive datasets. Our approach can also handle datasets too massive to fit on any single data repository. We also modify the algorithm to handle classification problems, which the original R package does not support. Experiments on data show the advantage of distributed SBART over BART for classification problems.
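The SPMD idea described above can be illustrated with a minimal single-process sketch: each worker computes per-leaf sufficient statistics on its own shard of the data, and the shard-level results are summed. Everything here is an assumption made for illustration, not the paper's implementation: the function names (`leaf_of`, `local_stats`, `reduce_stats`, `split_into_shards`) are hypothetical, a hard one-split tree stands in for the trees used by SBART, and the statistics (count, sum, sum of squares of residuals) are the usual ones for a Gaussian leaf model rather than the ones the paper actually uses. In a real MPI program the final merge would correspond to a sum reduction across ranks; here it is simulated with a plain loop.

```python
from collections import defaultdict


def leaf_of(x, split_var=0, split_value=0.5):
    """Hypothetical one-split tree: leaf 0 if x[split_var] <= split_value, else leaf 1."""
    return 0 if x[split_var] <= split_value else 1


def local_stats(shard):
    """Per-leaf [count, sum, sum of squares] of residuals for one worker's shard."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for x, residual in shard:
        s = stats[leaf_of(x)]
        s[0] += 1
        s[1] += residual
        s[2] += residual * residual
    return dict(stats)


def reduce_stats(all_stats):
    """Sum shard-level statistics leaf by leaf (stands in for an MPI sum reduction)."""
    total = defaultdict(lambda: [0, 0.0, 0.0])
    for stats in all_stats:
        for leaf, (n, s, ss) in stats.items():
            t = total[leaf]
            t[0] += n
            t[1] += s
            t[2] += ss
    return dict(total)


def split_into_shards(data, n_workers):
    """Round-robin partition of the data, one shard per simulated worker."""
    return [data[i::n_workers] for i in range(n_workers)]


# Toy (covariates, residual) pairs; values are chosen to be exact in floating point.
data = [((0.1,), 1.0), ((0.4,), 2.0), ((0.6,), 3.0), ((0.9,), 5.0), ((0.2,), 0.5)]

shards = split_into_shards(data, 3)
merged = reduce_stats(local_stats(shard) for shard in shards)
serial = local_stats(data)

print(merged == serial)  # the distributed and serial statistics agree
```

Because the statistics are additive over observations, each worker only needs to communicate a few numbers per leaf, not its data, which is why this kind of computation can spread across many cores and across data that never sits in one place.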