Constructing ECOC based on confusion matrix for multiclass learning problems

In pattern recognition, error-correcting output codes (ECOC) are a powerful tool for fusing any number of binary classifiers to model multiclass problems, and data-driven encoding is attracting growing attention. In this paper, we propose a new encoding method for constructing subclass error-correcting output codes, a framework first introduced by Escalera et al. We first obtain the correlation between each pair of classes from the confusion matrix. We then select the most easily separated subclasses for classification by following Fisher’s principle. Finally, we obtain binary partitions based on these subclasses. The result is a new data-driven coding matrix, Subclass ECOC. Experimental results on University of California, Irvine (UCI) data sets and three kinds of high resolution range profile (HRRP) data sets, with a logistic linear classifier and a support vector machine as the binary classifiers, show that our approach provides better classification performance and robustness, at the cost of a slightly longer but acceptable code length.

Highlights: We propose a new data-driven strategy for constructing error-correcting output codes to solve multiclass classification problems. The method first uses the confusion matrix to measure the similarity between different classes in the multiclass sample space, and from this derives the between-class scatter. Next, based on the between-class scatter, it searches for the optimal subspace partition and obtains a set of binary subspace partitions. Finally, it optimizes the resulting binary subspaces (by merging and splitting) to form the error-correcting output code. Partitioning the sample set according to this coding matrix and training on it yields the optimal binary classifiers, which effectively improves the accuracy and generalization ability of multiclass classification.
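The first step described above, turning a confusion matrix into a pairwise class similarity and then into one binary partition, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the symmetrized confusion rate used as similarity, and the seed-and-attach splitting heuristic are all assumptions made for illustration, and the Fisher-criterion subclass selection from the paper is not reproduced here.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """C[i, j] = number of samples of class i predicted as class j."""
    C = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C

def class_similarity(C):
    """Symmetric similarity: mean rate at which two classes are confused.

    Assumption for this sketch: similarity is the average of the two
    row-normalised off-diagonal confusion rates.
    """
    row_sums = C.sum(axis=1, keepdims=True)
    rates = C / np.maximum(row_sums, 1.0)  # guard against empty classes
    S = 0.5 * (rates + rates.T)
    np.fill_diagonal(S, 0.0)
    return S

def split_by_confusion(S):
    """Hypothetical heuristic for one ECOC column (+1 / -1 per class).

    Seed the two groups with the least-confused pair of classes, then
    attach every remaining class to the seed it is confused with more.
    """
    n = S.shape[0]
    masked = S.copy()
    np.fill_diagonal(masked, np.inf)  # ignore self-similarity
    a, b = np.unravel_index(np.argmin(masked), masked.shape)
    column = np.zeros(n, dtype=int)
    column[a], column[b] = 1, -1
    for k in range(n):
        if k not in (a, b):
            column[k] = 1 if S[k, a] >= S[k, b] else -1
    return column

# Toy labels: classes 0/1 are confused with each other, as are 2/3.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
y_pred = [0, 0, 1, 1, 1, 0, 2, 2, 3, 3, 3, 2]

C = confusion_matrix(y_true, y_pred, n_classes=4)
S = class_similarity(C)
column = split_by_confusion(S)
print(column)
```

On the toy labels, the classes that are mutually confused end up on the same side of the partition, so the resulting binary problem separates the groups that a base classifier already distinguishes well.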

[1] Nicolás García-Pedrajas, et al. Improving multiclass pattern recognition by the combination of two strategies, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Atr State, et al. An Adaptive Classification Method of BP-NN Group Based Classification System and Its Application, 2001.

[3] Ching Y. Suen, et al. Unconstrained numeral pair recognition using enhanced error correcting output coding: a holistic approach, 2005, Eighth International Conference on Document Analysis and Recognition (ICDAR'05).

[4] Terry Windeatt, et al. Boosted ECOC ensembles for face recognition, 2003.

[5] Sergio Escalera, et al. On the Decoding Process in Ternary Error-Correcting Output Codes, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Janez Demsar, et al. Statistical Comparisons of Classifiers over Multiple Data Sets, 2006, J. Mach. Learn. Res.

[7] Anil K. Jain, et al. Statistical Pattern Recognition: A Review, 2000, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8] Jordi Vitrià, et al. Discriminant ECOC: a heuristic method for application dependent design of error correcting output codes, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] David Masip, et al. Online error correcting output codes, 2011, Pattern Recognit. Lett.

[10] Dewen Hu, et al. Globally Consistent Reconstruction of Ripped-Up Documents, 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11] Harald Ruda, et al. Framework for Automatic Target Recognition Optimization, 1997.

[12] Wolfgang Utschick, et al. Stochastic Organization of Output Codes in Multiclass Learning Problems, 2001, Neural Computation.

[13] Claudio Marrocco, et al. Design of reject rules for ECOC classification systems, 2012, Pattern Recognit.

[14] Zhang Yu-xi. HRRP recognition for polarization radar based on Bagging-SVM dynamic ensemble, 2012.

[15] Yoram Singer, et al. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers, 2000, J. Mach. Learn. Res.

[16] Terry Windeatt, et al. Weighted Decoding ECOC for Facial Action Unit Classification, 2009, Applications of Supervised and Unsupervised Ensemble Methods.

[17] Shahar Mendelson, et al. On the Size of Convex Hulls of Small Sets, 2002, J. Mach. Learn. Res.

[18] Rayid Ghani, et al. Combining labeled and unlabeled data for text classification with a large number of categories, 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[19] Ethem Alpaydin, et al. Learning error-correcting output codes from data, 1999.

[20] Nicolás García-Pedrajas, et al. An empirical study of binary classifier fusion methods for multiclass classification, 2011, Inf. Fusion.

[21] Sergio Escalera, et al. Subclass Problem-Dependent Design for Error-Correcting Output Codes, 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22] Sayan Mukherjee, et al. Choosing Multiple Parameters for Support Vector Machines, 2002, Machine Learning.

[23] Thomas G. Dietterich, et al. Solving Multiclass Learning Problems via Error-Correcting Output Codes, 1994, J. Artif. Intell. Res.