A hierarchical method for multi-class support vector machines

We introduce Divide-by-2 (DB2), a framework for extending support vector machines (SVMs) to multi-class problems. DB2 offers an alternative to the standard one-against-one and one-against-rest algorithms. For an N-class problem, DB2 produces a binary decision tree with N − 1 nodes, where each node represents a decision boundary formed by one of N − 1 binary SVM classifiers. This tree structure allows us to present a generalization analysis and a time complexity analysis of DB2. Our analysis and related experiments show that DB2 is faster than both the one-against-one and one-against-rest algorithms in testing time, is significantly faster than one-against-rest in training time, and achieves cross-validation accuracy comparable to these two methods.
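The tree structure described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how classes are divided at each node, so the split rule used here (order class centroids along their widest axis and cut the list in half) is an assumption. The nearest-centroid rule `train_centroid_rule` is a hypothetical stand-in for a binary SVM, and all function names are illustrative. The sketch shows two points: an N-class problem yields N − 1 decision nodes, and a test point is evaluated only at the nodes along one root-to-leaf path. For comparison, one-against-one trains N(N − 1)/2 binary classifiers and one-against-rest trains N.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple, Union

Point = Sequence[float]
Decide = Callable[[Point], bool]  # True means "go to the right subtree"


def centroid(points: List[Point]) -> List[float]:
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Node:
    decide: Decide
    left: Union["Node", int]   # a subtree, or a class label at a leaf
    right: Union["Node", int]


def train_centroid_rule(left_pts: List[Point], right_pts: List[Point]) -> Decide:
    # Hypothetical stand-in for a binary SVM: pick the nearer group centroid.
    c_left, c_right = centroid(left_pts), centroid(right_pts)
    return lambda x: sq_dist(x, c_right) < sq_dist(x, c_left)


def build_tree(data: Dict[int, List[Point]], train=train_centroid_rule):
    labels = sorted(data)
    if len(labels) == 1:
        return labels[0]  # leaf: a single class remains
    # Assumed split rule: order classes along the axis where the class
    # centroids are most spread out, then cut the ordering in half.
    cents = {c: centroid(data[c]) for c in labels}
    dims = range(len(cents[labels[0]]))
    axis = max(dims, key=lambda d: max(v[d] for v in cents.values())
               - min(v[d] for v in cents.values()))
    ordered = sorted(labels, key=lambda c: cents[c][axis])
    half = len(ordered) // 2
    left, right = ordered[:half], ordered[half:]
    decide = train([p for c in left for p in data[c]],
                   [p for c in right for p in data[c]])
    return Node(decide,
                build_tree({c: data[c] for c in left}, train),
                build_tree({c: data[c] for c in right}, train))


def predict(tree, x: Point) -> Tuple[int, int]:
    # Returns (predicted label, number of binary classifiers evaluated).
    evaluations = 0
    while isinstance(tree, Node):
        tree = tree.right if tree.decide(x) else tree.left
        evaluations += 1
    return tree, evaluations


def count_nodes(tree) -> int:
    if not isinstance(tree, Node):
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


# Toy data: four classes, each with two points centred at (10 * k, 0).
data = {k: [(10.0 * k, -1.0), (10.0 * k, 1.0)] for k in range(4)}
tree = build_tree(data)
print(count_nodes(tree))          # number of decision nodes for N = 4
print(predict(tree, (9.0, 1.0)))  # (label, classifiers evaluated)
```

Replacing `train_centroid_rule` with a function that fits a real binary SVM and returns its decision rule would leave the tree construction and prediction code unchanged.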
