We present a new regression algorithm called Groves of trees and show empirically that it outperforms a number of established regression methods. A Grove is an additive model usually containing a small number of large trees. Trees added to the Grove are trained on the residual error of the other trees already in the Grove. We begin the training process with a single small tree in the Grove and gradually increase both the number of trees in the Grove and their size. This procedure ensures that the resulting model captures the additive structure of the response. A single Grove may still overfit the training set, so we further decrease the variance of the final predictions with bagging. We show that in addition to exhibiting superior performance on a suite of regression test problems, bagged Groves of trees are very resistant to overfitting.
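The training procedure described above lends itself to a short sketch. The Python below is a minimal illustration, not the authors' implementation: it uses scikit-learn's DecisionTreeRegressor as the tree learner, an illustrative growth schedule of (number of trees, maximum leaf nodes) stages, and a fixed number of refitting passes in place of the paper's convergence criterion; bagging is done over simple bootstrap samples.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_grove(X, y, schedule=((1, 4), (2, 8), (4, 32), (8, None)), passes=3):
    """Fit one Grove: an additive ensemble of trees trained on residuals.

    `schedule` lists (number of trees, max leaf nodes) stages, so both the
    number of trees in the Grove and their size grow gradually; None means
    unrestricted trees. Each tree is repeatedly refit on the residual left
    by the other trees, so the ensemble captures additive structure.
    (Schedule and pass count are illustrative, not the paper's settings.)
    """
    grove, preds = [], []                      # trees and their training-set predictions
    for n_trees, max_leaves in schedule:
        while len(grove) < n_trees:            # new trees start out predicting zero
            grove.append(None)
            preds.append(np.zeros(len(y)))
        for _ in range(passes):                # cycle so the trees adapt to one another
            for i in range(len(grove)):
                residual = y - (np.sum(preds, axis=0) - preds[i])  # error of the other trees
                tree = DecisionTreeRegressor(max_leaf_nodes=max_leaves)
                tree.fit(X, residual)
                grove[i] = tree
                preds[i] = tree.predict(X)
    return grove

def predict_grove(grove, X):
    """A Grove's prediction is the sum of its trees' predictions."""
    return np.sum([t.predict(X) for t in grove], axis=0)

def bagged_groves(X, y, n_bags=10, seed=0, **grove_kw):
    """Reduce variance by fitting each Grove on a bootstrap sample of the data."""
    rng = np.random.default_rng(seed)
    groves = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(y), size=len(y))   # sample rows with replacement
        groves.append(fit_grove(X[idx], y[idx], **grove_kw))
    return groves

def predict_bagged(groves, X):
    """Average the predictions of the bagged Groves."""
    return np.mean([predict_grove(g, X) for g in groves], axis=0)
```

In this sketch the fixed number of refitting passes and the hand-picked schedule stand in for the paper's gradual, convergence-based layering; the essential points it is meant to show are that each tree is fit on the residual of the rest of the Grove and that bagging is applied on top of whole Groves rather than individual trees.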