Learning Decision Trees with Stochastic Linear Classifiers

We propose a top-down decision tree learning algorithm whose internal nodes use stochastic linear classifiers, a class of linear classifiers, as their hypothesis class. To this end, we derive efficient algorithms for minimizing the Gini index over this class at each internal node, even though the minimization problem is non-convex. The proposed algorithm also has a theoretical guarantee under the weak stochastic hypothesis assumption.
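To make the per-node objective concrete, the following is a minimal sketch of how the Gini index of a single stochastic split might be evaluated. The abstract does not define the classifier, so the sketch rests on stated assumptions: a point x with norm at most 1 is routed to the right child with probability (1 + w·x)/2 for a weight vector w with norm at most 1, labels are binary, and the Gini index is scaled as 4p(1 - p). The function names `gini` and `stochastic_split_gini` are hypothetical. The sketch only evaluates the objective for a given w; it does not reproduce the minimization algorithms referred to above.

```python
import numpy as np


def gini(p):
    """Gini index of a node whose positive-label fraction is p (scaled so gini(0.5) == 1)."""
    return 4.0 * p * (1.0 - p)


def stochastic_split_gini(X, y, w):
    """Expected Gini index after a soft split of the sample (X, y) by weights w.

    Assumption (not stated in the abstract): each row x of X is sent to the
    right child with probability (1 + w.x) / 2, with ||x|| <= 1 and ||w|| <= 1,
    so the probability always lies in [0, 1].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)  # binary labels in {0, 1}
    w = np.asarray(w, dtype=float)

    p_right = (1.0 + X @ w) / 2.0   # per-point probability of going right
    p_left = 1.0 - p_right
    n = len(y)

    total = 0.0
    for probs in (p_left, p_right):
        mass = probs.sum()          # expected number of points in this child
        if mass > 0:
            q = (probs * y).sum() / mass   # expected positive fraction in the child
            total += (mass / n) * gini(q)  # weight the child by its share of the sample
    return total
```

Under these assumptions, a weight vector that separates the two classes drives the value toward 0, while w = 0 routes every point with probability 1/2 and leaves the value equal to the Gini index of the parent node.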
