Support Vector Machine Decision Trees with Rare Event Detection

Model selection and rare event detection are two major problems frequently encountered in support vector learning. This paper proposes a new support vector learning algorithm, referred to as the Linear Support Vector Machine Decision Tree (LSVM-DT), that addresses both problems. It consists of a binary tree with a linear support vector machine in every internal node and a class label in every leaf. During training, multiple separating hyperplanes are constructed while traversing down the tree, so the LSVM-DT can separate both linearly and nonlinearly separable data. The only parameter the user must choose is the regularization parameter C, thus eliminating the model selection problem. A built-in rare event detection mechanism allows the LSVM-DT to solve classification problems with underrepresented or disproportionately sized classes. Such class imbalance can arise in the data reaching nodes within the tree even if the classes initially have equal prior probabilities. Experiments on different data sets show that the LSVM-DT achieves performance comparable to that of regular decision trees and of polynomial and Gaussian SVMs. The LSVM-DT can be generalized to a Support Vector Machine Decision Tree (SVM-DT) by replacing the linear SVM in each node with a nonlinear SVM.
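
The structure described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify how each node is trained, how splitting stops, or how the rare event mechanism works. The sketch assumes a soft-margin linear SVM fitted by stochastic subgradient descent in each node, inverse-frequency class weights as a stand-in for rare event handling, and a simple depth limit. All function names (`train_linear_svm`, `build_tree`, `predict`) and the parameters `epochs`, `lr`, and `max_depth` are hypothetical; only the regularization parameter C comes from the text.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01, seed=0):
    """Stochastic subgradient descent on the soft-margin objective
    0.5*||w||^2 + C * sum_i c_i * max(0, 1 - y_i*(w.x_i + b)), y_i in {-1, +1}."""
    n, d = len(X), len(X[0])
    n_pos = sum(1 for t in y if t == 1)
    n_neg = n - n_pos
    # Assumption (not from the paper): weight each class inversely to its
    # frequency at this node so that a rare class is not ignored.
    class_weight = {1: n / (2.0 * n_pos), -1: n / (2.0 * n_neg)}
    w, b = [0.0] * d, 0.0
    order = list(range(n))
    rng = random.Random(seed)
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1
            # Each sample carries a 1/n share of the regularizer's gradient.
            w = [wj - lr * wj / n for wj in w]
            if violated:
                step = lr * C * class_weight[y[i]] * y[i]
                w = [wj + step * xj for wj, xj in zip(w, X[i])]
                b += step
    return w, b


def majority(y):
    return 1 if sum(y) >= 0 else -1


def build_tree(X, y, C=1.0, depth=0, max_depth=5):
    """Return a nested dict: internal nodes hold (w, b), leaves hold a label."""
    if len(set(y)) == 1 or depth == max_depth:
        return {"label": majority(y)}
    w, b = train_linear_svm(X, y, C=C)
    side = [dot(w, x) + b >= 0 for x in X]
    if all(side) or not any(side):
        # The hyperplane did not split the data; stop with a majority leaf.
        return {"label": majority(y)}
    pos = [(x, t) for x, t, s in zip(X, y, side) if s]
    neg = [(x, t) for x, t, s in zip(X, y, side) if not s]
    return {
        "w": w,
        "b": b,
        "pos": build_tree([x for x, _ in pos], [t for _, t in pos], C, depth + 1, max_depth),
        "neg": build_tree([x for x, _ in neg], [t for _, t in neg], C, depth + 1, max_depth),
    }


def predict(node, x):
    """Route x down the tree by the sign of each node's decision function."""
    while "label" not in node:
        node = node["pos"] if dot(node["w"], x) + node["b"] >= 0 else node["neg"]
    return node["label"]


if __name__ == "__main__":
    X = [[2, 2], [3, 2], [2, 3], [3, 3], [-2, -2], [-3, -2], [-2, -3], [-3, -3]]
    y = [1, 1, 1, 1, -1, -1, -1, -1]
    tree = build_tree(X, y, C=1.0)
    print([predict(tree, x) for x in X])
```

On linearly separable data the tree typically reduces to a single node with two leaves. On nonlinearly separable data further hyperplanes are added in the child nodes, though whether this sketch separates a given data set depends on the assumed optimizer settings, which the paper does not prescribe.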
