Functional Trees for Regression

In this paper we present and evaluate a new algorithm for supervised regression problems. The algorithm combines a univariate regression tree with a linear regression function by means of constructive induction. While the tree is grown, a linear regression function creates one new attribute at each internal node. The value of this new attribute for each example that falls at the node is the prediction of the regression function for that example. The extended instance space is propagated down through the tree. Because each new attribute is a linear combination of the original attributes, a test on it corresponds to an oblique decision surface. Our approach can be seen as a hybrid model that combines linear regression, known to have low variance, with a regression tree, known to have low bias. We compared our algorithm against its components, two simplified versions, and M5 on 16 benchmark datasets. The experimental evaluation shows that our algorithm has clear advantages in generalization ability over its components and competes well against the state of the art in regression trees.
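The constructive step at a single node can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_linear`, `predict_linear`, `extend_node`, `best_split`) are hypothetical, ordinary least squares stands in for the linear regression function, and the split criterion (minimum sum of squared errors around the child means) is an assumption made for illustration only. The sketch shows how the prediction of a linear model becomes one extra attribute, and why a univariate test on that attribute is an oblique test in the original space.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bd]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Evaluate the fitted linear function on the rows of X."""
    return coef[0] + X @ coef[1:]


def extend_node(X, y):
    """Constructive step (assumed form): append the linear model's
    prediction for each example at this node as one new attribute."""
    coef = fit_linear(X, y)
    new_attribute = predict_linear(coef, X)
    return np.column_stack([X, new_attribute]), coef


def best_split(X, y):
    """Exhaustive univariate split search.

    Returns (attribute index, threshold, sse), where sse is the sum of
    squared errors around the two child means. This criterion is an
    assumption of the sketch, not taken from the paper.
    """
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2:
            left = X[:, j] <= t
            sse = ((y[left] - y[left].mean()) ** 2).sum()
            sse += ((y[~left] - y[~left].mean()) ** 2).sum()
            if sse < best[2]:
                best = (j, t, sse)
    return best


if __name__ == "__main__":
    # Synthetic data whose true boundary, x0 + x1 = 0, is oblique.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)

    X_ext, coef = extend_node(X, y)
    print("best split, original attributes:", best_split(X, y))
    print("best split, extended attributes:", best_split(X_ext, y))
    # A test "new attribute <= t" is the oblique test
    # coef[1]*x0 + coef[2]*x1 <= t - coef[0] in the original space.
```

In a full tree, the children of the node would receive the extended matrix, so attributes built higher up remain available to tests and models further down. How leaves predict and how the tree is pruned are outside the scope of this sketch.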