A novel Error-Correcting Output Codes algorithm based on genetic programming

Abstract: Error-Correcting Output Codes (ECOC) are widely used in multiclass classification. Because a well-designed codematrix is key to the performance of an ECOC algorithm, this paper proposes a genetic programming (GP) based ECOC algorithm (GP-ECOC). In our GP, each terminal node of an individual represents a class, and each nonterminal node combines the classes of its child nodes. An individual is thus a class combination tree that represents an ECOC codematrix. A legality check embedded in the algorithm verifies that each codematrix satisfies the ECOC constraints; codematrices that violate them are corrected by a proposed Guided Mutation operator. Before fitness evaluation, a proposed local optimization algorithm appends new columns for tough (hard-to-classify) classes, which improves the generalization ability of each individual and speeds up evolution. In this way, our GP evolves optimized codematrices over the course of the evolutionary process. Experiments on various UCI data sets show that, compared with other ensemble algorithms, our algorithm achieves stable, high performance with relatively small ensemble sizes. To the best of our knowledge, this is the first time that GP has been applied to the ECOC encoding problem. Our Python code is available at https://github.com/samuellees/gpecoc.
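
To make the tree-to-codematrix idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary tree in which each nonterminal node yields one ternary column: classes under the left child are coded +1, classes under the right child -1, and all other classes 0. It also assumes each class appears at most once in the tree. The legality check uses commonly stated ECOC constraints (every column contains both +1 and -1, no two columns are identical or complementary, no two rows are identical); the abstract does not list the exact constraints used. The names `leaves`, `tree_to_codematrix`, and `is_legal` are hypothetical.

```python
import numpy as np

# A tree is either an int (terminal node: a class index)
# or a 2-tuple (left_subtree, right_subtree) (nonterminal node).

def leaves(tree):
    """Return the class indices found under a (sub)tree."""
    if isinstance(tree, int):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

def tree_to_codematrix(tree, n_classes):
    """Build a ternary codematrix with one column per nonterminal node.

    Assumed rule (not taken from the paper): left-child classes -> +1,
    right-child classes -> -1, classes outside the node -> 0.
    The tree must contain at least one nonterminal node.
    """
    columns = []

    def visit(node):
        if isinstance(node, int):
            return
        left, right = node
        col = np.zeros(n_classes, dtype=int)
        col[leaves(left)] = 1
        col[leaves(right)] = -1
        columns.append(col)
        visit(left)
        visit(right)

    visit(tree)
    return np.column_stack(columns)

def is_legal(matrix):
    """Check commonly used ECOC constraints (an assumption, see lead-in)."""
    cols = list(matrix.T)
    # Every column must define a real dichotomy: both +1 and -1 present.
    for col in cols:
        if not ((col == 1).any() and (col == -1).any()):
            return False
    # No two columns may be identical or exact complements.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if (cols[i] == cols[j]).all() or (cols[i] == -cols[j]).all():
                return False
    # Every class needs a distinct codeword (row).
    return len({tuple(row) for row in matrix}) == matrix.shape[0]

example_tree = ((0, 1), (2, 3))          # four classes, three nonterminals
M = tree_to_codematrix(example_tree, 4)  # shape: (4 classes, 3 columns)
print(M)
print(is_legal(M))
```

In this sketch, a codematrix that fails `is_legal` would be the kind of individual handed to a repair step such as the Guided Mutation operator mentioned in the abstract; the repair itself is not shown because the abstract does not describe it.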
