Constraint Classification for Multiclass Classification and Ranking

The constraint classification framework captures many flavors of multiclass classification, including winner-take-all multiclass classification, multilabel classification, and ranking. We present a meta-algorithm for learning in this framework that learns via a single linear classifier in high dimension. We discuss distribution-independent as well as margin-based generalization bounds, and present empirical and theoretical evidence that constraint classification offers benefits over existing methods of multiclass classification.
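The idea of learning through a single linear classifier in high dimension can be illustrated with a small sketch. It assumes a Kesler-style embedding and a plain perceptron as the underlying binary learner; the abstract specifies neither, so both are illustrative choices rather than the authors' stated method. Each example carries a set of pairwise constraints (i, j), read as "class i should score higher than class j", and each constraint becomes one positive example in a (k * d)-dimensional space. The names embed, train, scores, and predict, as well as the toy data, are hypothetical.

```python
from typing import List, Sequence, Tuple

Constraint = Tuple[int, int]  # (i, j): class i should outscore class j
Example = Tuple[Sequence[float], List[Constraint]]


def embed(x: Sequence[float], i: int, j: int, k: int) -> List[float]:
    """Copy x into block i and -x into block j of a (k * d)-dimensional vector."""
    d = len(x)
    z = [0.0] * (k * d)
    for t, v in enumerate(x):
        z[i * d + t] = v
        z[j * d + t] = -v
    return z


def train(examples: List[Example], k: int, epochs: int = 100) -> List[float]:
    """Perceptron over embedded constraints (assumed learner, not from the source)."""
    d = len(examples[0][0])
    w = [0.0] * (k * d)
    for _ in range(epochs):
        mistakes = 0
        for x, constraints in examples:
            for i, j in constraints:
                z = embed(x, i, j, k)
                # A constraint is violated when w . z <= 0; update additively.
                if sum(wi * zi for wi, zi in zip(w, z)) <= 0:
                    w = [wi + zi for wi, zi in zip(w, z)]
                    mistakes += 1
        if mistakes == 0:  # every constraint satisfied
            break
    return w


def scores(w: Sequence[float], x: Sequence[float], k: int) -> List[float]:
    """Per-class scores: block c of w acts as the weight vector of class c."""
    d = len(x)
    return [sum(w[c * d + t] * x[t] for t in range(d)) for c in range(k)]


def predict(w: Sequence[float], x: Sequence[float], k: int) -> int:
    """Winner-take-all prediction: the class with the highest score."""
    s = scores(w, x, k)
    return max(range(k), key=lambda c: s[c])


# Toy multiclass data (hypothetical): the last feature is a constant bias term.
K = 3
points = [((1.0, 0.0, 1.0), 0), ((0.0, 1.0, 1.0), 1), ((-1.0, -1.0, 1.0), 2)]
# A multiclass label y yields the constraints (y, j) for every j != y.
data = [(x, [(y, j) for j in range(K) if j != y]) for x, y in points]
w = train(data, K)
predictions = [predict(w, x, K) for x, _ in points]
```

The same training loop covers the other settings named in the abstract by changing only the constraint sets: a multilabel example would list (i, j) for each relevant class i and irrelevant class j, and a ranking example would list one pair per adjacent position in the desired order.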
