Multi-Class Support Vector Machine

The support vector machine (SVM) was originally designed for binary classification. Several models have been proposed to extend it to the multi-class setting, among them the one by Crammer and Singer (J Mach Learn Res 2:265–292, 2001). However, the number of variables in Crammer and Singer's dual problem is the product of the number of samples (l) and the number of classes (k), which makes training computationally expensive. This chapter divides the classical techniques for multi-class SVM into indirect and direct approaches and compares them both theoretically and experimentally. In particular, it presents a Simplified Multi-class SVM (SimMSVM) that reduces the size of the resulting dual problem from l × k to l by introducing a relaxed classification error bound. The experiments show that SimMSVM can greatly speed up training while maintaining competitive classification accuracy.
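To make the size comparison concrete, the following minimal sketch counts dual variables for a given class distribution. It is only an illustration: the function name dual_sizes and the example class counts are hypothetical, and the one-vs-rest and one-vs-one rows use the standard decomposition counts (k binary problems over all l samples, and one binary problem per class pair over the samples of those two classes) rather than figures taken from the chapter.

```python
from itertools import combinations


def dual_sizes(class_counts):
    """Count dual variables per formulation (illustrative helper, not from the chapter).

    class_counts: number of training samples in each class.
    Returns a dict mapping a formulation name to a list with one entry
    per optimization problem solved, giving that problem's variable count.
    """
    l = sum(class_counts)   # total number of samples
    k = len(class_counts)   # number of classes
    return {
        # Indirect: k binary problems, each over all l samples.
        "one_vs_rest": [l] * k,
        # Indirect: one binary problem per class pair, over those two classes only.
        "one_vs_one": [a + b for a, b in combinations(class_counts, 2)],
        # Direct: a single problem with l * k variables (Crammer and Singer).
        "crammer_singer": [l * k],
        # Direct: a single problem with l variables (SimMSVM, as stated in the abstract).
        "simmsvm": [l],
    }


# Hypothetical balanced data set: 4 classes with 1000 samples each.
sizes = dual_sizes([1000, 1000, 1000, 1000])
for name, problems in sizes.items():
    print(f"{name:15s} problems={len(problems):2d} "
          f"largest={max(problems):6d} total={sum(problems):6d}")
```

For this hypothetical data set, the Crammer and Singer dual has 16000 variables in a single problem, whereas the SimMSVM dual has 4000. The count says nothing about accuracy or about how hard each individual problem is to solve; it only shows where the l × k versus l difference comes from.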
