Oblique Linear Tree

In this paper we present the system Ltree for propositional supervised learning. Ltree is able to define decision surfaces that are both orthogonal and oblique to the axes defined by the attributes of the input space. This is done by combining a decision tree with a linear discriminant by means of constructive induction. At each decision node, Ltree defines a new instance space by inserting new attributes that are the projections of the examples falling at that node onto the hyperplanes given by a linear discriminant function. This new instance space is propagated down through the tree. Tests based on these new attributes are oblique with respect to the original input space. Ltree is a probabilistic tree in the sense that it outputs a class probability distribution for each query example. The class probability distribution is computed at learning time, taking into account the different class distributions on the path from the root to the current node. We have carried out experiments on sixteen benchmark datasets and compared our system with other well-known decision tree systems, both orthogonal and oblique, such as C4.5, OC1, and LMDT. On these datasets we have observed that our system has advantages in terms of both accuracy and tree size, at statistically significant confidence levels.
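The sketch below illustrates the constructive induction step described above: at each decision node a linear discriminant is fitted to the examples reaching the node, its projections are appended as new attributes, and the best univariate test over the augmented space (which is oblique in the original space) is chosen. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Gini splitting criterion, and the leaf-level class distribution are placeholders, since the paper's exact criterion and its path-based probability smoothing are not reproduced here.

```python
# Minimal sketch of decision-node constructive induction with a linear
# discriminant. NOT the Ltree implementation: grow_ltree, the Gini
# criterion, and the plain leaf distribution are illustrative assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def _gini(y):
    """Gini impurity of a label vector (assumed splitting criterion)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def grow_ltree(X, y, depth=0, max_depth=5, min_samples=10):
    """Recursively grow a tree over an instance space that is augmented
    with linear-discriminant projections at every decision node."""
    classes, counts = np.unique(y, return_counts=True)
    # Leaf: return the class probability distribution at this node
    # (the paper smooths this along the root-to-node path; omitted here).
    if depth == max_depth or len(X) < min_samples or len(classes) == 1:
        return {"leaf": True, "dist": dict(zip(classes, counts / counts.sum()))}

    # Constructive induction: project the examples onto the hyperplanes
    # given by a linear discriminant and append them as new attributes.
    lda = LinearDiscriminantAnalysis().fit(X, y)
    X_aug = np.hstack([X, lda.decision_function(X).reshape(len(X), -1)])

    # Choose the best univariate test over the augmented attribute set;
    # tests on the appended attributes are oblique in the original space.
    best = None
    for j in range(X_aug.shape[1]):
        for t in np.unique(X_aug[:, j]):
            mask = X_aug[:, j] <= t
            if mask.all() or not mask.any():
                continue
            score = _gini(y[mask]) * mask.mean() + _gini(y[~mask]) * (~mask).mean()
            if best is None or score < best[0]:
                best = (score, j, t, mask)
    if best is None:  # no valid split: fall back to a leaf
        return {"leaf": True, "dist": dict(zip(classes, counts / counts.sum()))}

    _, j, t, mask = best
    # The augmented instance space is propagated down to both subtrees.
    return {
        "leaf": False, "attr": j, "thr": t, "lda": lda,
        "left": grow_ltree(X_aug[mask], y[mask], depth + 1, max_depth, min_samples),
        "right": grow_ltree(X_aug[~mask], y[~mask], depth + 1, max_depth, min_samples),
    }
```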
