Joint learning of error-correcting output codes and dichotomizers from data

The error-correcting output codes (ECOC) technique is a powerful tool for learning and combining multiple binary learners for multi-class classification. It generally involves three steps: coding, dichotomizer learning, and decoding. In previous ECOC methods, the coding step and the dichotomizer learning step are usually performed independently. This simplifies the learning problem but may lead to unsatisfactory decoding results. To solve this problem, we propose a novel model for learning the ECOC matrix and the dichotomizers jointly from data. We formulate the model as a nonlinear programming problem and develop an efficient alternating minimization algorithm to solve it. Specifically, for a fixed ECOC matrix, our model decomposes into a group of mutually independent quadratic programming problems, while for fixed dichotomizers, it is a difference of convex functions problem that can be easily solved using the concave-convex procedure. Our experimental results on ten data sets from the UCI machine learning repository demonstrate the advantage of our model over state-of-the-art ECOC methods.
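
For readers unfamiliar with the three steps named above, the following is a minimal sketch of a conventional ECOC pipeline in which coding and dichotomizer learning are performed independently. It is not the joint model proposed here. The one-vs-all coding matrix, the perceptron used as the binary learner, Hamming decoding, and all function names (`one_vs_all_code`, `train_perceptron`, `fit_dichotomizers`, `decode`) are illustrative assumptions, not taken from the paper.

```python
def one_vs_all_code(n_classes):
    """Coding step (assumed one-vs-all design): row k is the codeword of
    class k, and column j defines the binary problem of dichotomizer j."""
    return [[1 if k == j else -1 for j in range(n_classes)]
            for k in range(n_classes)]


def train_perceptron(X, targets, epochs=100):
    """Train one linear dichotomizer with the perceptron rule.
    This is a stand-in for any binary learner; the last weight is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(X, targets):
            xb = list(x) + [1.0]
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if t * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + t * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w


def fit_dichotomizers(X, y, M):
    """Dichotomizer learning step: relabel the data with column j of the
    fixed coding matrix M and train dichotomizer j independently."""
    return [train_perceptron(X, [M[label][j] for label in y])
            for j in range(len(M[0]))]


def decode(x, W, M):
    """Decoding step: collect the binary outputs and return the class whose
    codeword is closest in Hamming distance (ties go to the lowest index)."""
    xb = list(x) + [1.0]
    outputs = [1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else -1
               for w in W]
    dists = [sum(o != m for o, m in zip(outputs, row)) for row in M]
    return dists.index(min(dists))


# Toy usage: three well-separated 2-D clusters (made-up data).
X = [(4.0, 0.1), (4.1, -0.1), (-2.0, 3.5), (-2.1, 3.4), (-2.0, -3.5), (-1.9, -3.4)]
y = [0, 0, 1, 1, 2, 2]
M = one_vs_all_code(3)
W = fit_dichotomizers(X, y, M)
predictions = [decode(x, W, M) for x in X]
```

In this sketch the coding matrix `M` is fixed before any dichotomizer is trained, which is the independence between coding and learning that the abstract identifies as a possible source of poor decoding results.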
