Multiclass Classification with Multi-Prototype Support Vector Machines

Winner-take-all multiclass classifiers are built on top of a set of prototypes, each representing one of the available classes. A pattern is then classified with the label associated with the most 'similar' prototype. Recent proposals of SVM extensions to multiclass problems can be considered instances of the same strategy with one prototype per class. The multi-prototype SVM proposed in this paper extends multiclass SVM to multiple prototypes per class. It allows several vectors to be combined in a principled way to obtain large margin decision functions. For this problem, we give a compact constrained quadratic formulation and propose a greedy optimization algorithm able to find locally optimal solutions for the non-convex objective function. This algorithm proceeds by reducing the overall problem to a series of simpler convex problems. For the solution of these reduced problems, an efficient optimization algorithm is proposed. A number of pattern selection strategies are then discussed to speed up the optimization process. In addition, given the combinatorial nature of the overall problem, stochastic search strategies are suggested to escape from local minima that are not globally optimal. Finally, we report experiments on a number of datasets. The performance obtained using a few simple linear prototypes is comparable to that obtained by state-of-the-art kernel-based methods, but with a significant reduction (of one or two orders of magnitude) in response time.
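The winner-take-all rule with several prototypes per class can be illustrated with a minimal sketch. It assumes linear prototypes scored by a plain dot product, and a margin defined as the best same-class score minus the best other-class score; the names `classify`, `margin`, and the toy prototypes are hypothetical and are not taken from the paper, and no training step is shown.

```python
from typing import List, Sequence, Tuple

# A prototype is a (weight vector, class label) pair; this layout is an assumption.
Prototype = Tuple[Sequence[float], str]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product used here as the similarity between a prototype and a pattern."""
    return sum(a * b for a, b in zip(u, v))


def classify(x: Sequence[float], prototypes: List[Prototype]) -> str:
    """Winner-take-all: return the label of the highest-scoring prototype."""
    best_weights, best_label = max(prototypes, key=lambda p: dot(p[0], x))
    return best_label


def margin(x: Sequence[float], y: str, prototypes: List[Prototype]) -> float:
    """Best score among prototypes of class y minus best score among the rest.

    A positive value means x is assigned to class y by classify().
    """
    own = max(dot(w, x) for w, label in prototypes if label == y)
    other = max(dot(w, x) for w, label in prototypes if label != y)
    return own - other


# Toy setup: class "a" has two prototypes, class "b" has one.
prototypes: List[Prototype] = [
    ((1.0, 0.0), "a"),
    ((-1.0, 0.0), "a"),
    ((0.0, 1.0), "b"),
]

label_right = classify((2.0, 0.5), prototypes)   # won by the first "a" prototype
label_left = classify((-3.0, 1.0), prototypes)   # won by the second "a" prototype
label_up = classify((0.2, 1.0), prototypes)      # won by the "b" prototype
```

In this toy case, class "a" covers two opposite regions of the input space, which a single linear prototype per class could not represent. Classification cost grows with the number of prototypes rather than with the number of support vectors, which is one plausible reading of the response-time reduction reported above.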
