In this paper we present and evaluate a new algorithm for regression problems in supervised learning. The algorithm combines a univariate regression tree with a linear regression function by means of constructive induction. When growing the tree, at each internal node a linear-regression function creates one new attribute: the instantiation of the regression function for each example that falls at that node. The augmented instance space is propagated down through the tree, and tests based on these new attributes correspond to oblique decision surfaces. Our approach can be seen as a hybrid model that combines linear regression, known to have low variance, with a regression tree, known to have low bias. The algorithm was compared against its components, two simplified versions, and M5 on 16 benchmark datasets. The experimental evaluation shows that our algorithm has clear advantages in generalization ability over its components and competes well with the state of the art in regression trees.
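To make the construction concrete, the following is a minimal Python sketch of the idea, not the authors' implementation: a leaf-mean regression tree in which each internal node fits a least-squares model (here scikit-learn's LinearRegression, assumed as a stand-in for the paper's linear-regression component) and appends its predictions as a constructed attribute before searching for the best univariate split. All function and key names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def best_split(X, y):
    # Exhaustive search for the univariate split minimising the
    # summed squared error of the two resulting partitions.
    best_j, best_t, best_sse = None, None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            if left.sum() < 2 or (~left).sum() < 2:
                continue
            sse = (((y[left] - y[left].mean()) ** 2).sum()
                   + ((y[~left] - y[~left].mean()) ** 2).sum())
            if sse < best_sse:
                best_j, best_t, best_sse = j, t, sse
    return best_j, best_t

def grow(X, y, depth=0, max_depth=3, min_samples=10):
    node = {"prediction": y.mean()}
    if depth >= max_depth or len(y) < min_samples:
        return node
    # Constructive induction: fit a linear model on the examples at this
    # node and append its predictions as one new attribute. A univariate
    # test on that attribute is an oblique surface in the original space.
    lm = LinearRegression().fit(X, y)
    X_aug = np.hstack([X, lm.predict(X).reshape(-1, 1)])
    j, t = best_split(X_aug, y)
    if j is None:
        return node
    node.update(lm=lm, feature=j, threshold=t)
    left = X_aug[:, j] <= t
    # The augmented instance space is what gets propagated downward.
    node["left"] = grow(X_aug[left], y[left], depth + 1, max_depth, min_samples)
    node["right"] = grow(X_aug[~left], y[~left], depth + 1, max_depth, min_samples)
    return node

def predict_one(node, x):
    # Re-create each node's constructed attribute while descending.
    while "lm" in node:
        x = np.append(x, node["lm"].predict(x.reshape(1, -1))[0])
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node["prediction"]
```

In this framing, a threshold test on the constructed attribute is a threshold on a linear combination of the original attributes, which is exactly an oblique decision surface; the sketch uses constant leaf predictions, whereas richer variants place linear models in the leaves as well.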