Multiclass Learning with Simplex Coding

In this paper we discuss a novel framework for multiclass learning, defined by a suitable coding/decoding strategy, namely the simplex coding, that allows us to generalize to multiple classes a relaxation approach commonly used in binary classification. In this framework, a relaxation error analysis can be developed without imposing constraints on the hypothesis class considered. Moreover, we show that in this setting it is possible to derive the first provably consistent regularized method whose training/tuning complexity is independent of the number of classes. We also introduce tools from convex analysis that can be used beyond the scope of this paper.
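The abstract does not spell out the coding itself. As a minimal sketch, assume the usual construction in which T classes are mapped to the T vertices of a regular simplex in R^(T-1), with unit-norm code vectors and pairwise inner product -1/(T-1), and in which a vector-valued prediction is decoded by choosing the code vector with the largest inner product. The function names `simplex_code` and `decode` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np


def simplex_code(num_classes):
    """Return a (T, T-1) array whose rows are unit-norm simplex vertices.

    Assumed construction: center the standard basis of R^T, express it in an
    orthonormal basis of the (T-1)-dimensional subspace orthogonal to the
    all-ones vector, then normalize each row.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    # Centered standard basis: each row sums to zero.
    centered = np.eye(num_classes) - 1.0 / num_classes
    # The first T-1 right singular vectors span the zero-sum subspace.
    _, _, vt = np.linalg.svd(centered)
    codes = centered @ vt[: num_classes - 1].T
    return codes / np.linalg.norm(codes, axis=1, keepdims=True)


def decode(codes, scores):
    """Map predictions in R^(T-1) to class indices via the largest inner product."""
    return np.argmax(np.asarray(scores) @ codes.T, axis=-1)


codes = simplex_code(4)
gram = codes @ codes.T  # ones on the diagonal, -1/3 off the diagonal
labels = decode(codes, codes)  # each vertex decodes to its own class
```

For two classes this reduces to the familiar binary coding with labels +1 and -1, which is the sense in which such a coding generalizes the binary relaxation approach.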
