Overfitting cautious selection of classifier ensembles with genetic algorithms

Information fusion research has recently focused on the characteristics of the decision profiles of ensemble members in order to optimize performance. These characteristics are particularly important in the selection of ensemble members. However, even though the control of overfitting is a central challenge in machine learning, much less work has been devoted to controlling overfitting in selection tasks. The objectives of this paper are: (1) to show that overfitting can be detected at the selection stage; and (2) to present strategies for controlling it. Decision trees and k-nearest-neighbors classifiers are used to create homogeneous ensembles, while single- and multi-objective genetic algorithms are employed as search algorithms at the selection stage. Bagging and the random subspace method are used for ensemble generation, and the classification error rate and a set of diversity measures serve as search criteria. We show experimentally that the selection of classifier ensembles by genetic algorithms is prone to overfitting, especially in the multi-objective case. The partial validation, backwarding, and global validation strategies are tailored to the classifier ensemble selection problem and compared. This comparison shows that a global validation strategy should be applied to control overfitting in pattern recognition systems involving an ensemble member selection task. Furthermore, it establishes that the global validation strategy can be used to measure the relationship between diversity and classification performance when diversity measures are employed as single-objective functions.
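To make the selection-stage setup concrete, the following is a minimal, self-contained sketch, not the authors' implementation: a pool of decision trees is overproduced with bagging, a simple single-objective GA searches for the sub-ensemble minimizing the majority-vote error on an optimization set, and a global-validation archive keeps the individual with the lowest error on a disjoint validation set, which is returned as the final solution. All set sizes, GA parameters, and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
# Four disjoint sets: train (classifier pool), optimization (GA fitness),
# validation (overfitting control), test (final report).
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.6, random_state=0)
X_opt, X_rest2, y_opt, y_rest2 = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_rest2, y_rest2, test_size=0.5, random_state=0)

# Overproduction phase: a pool of 30 trees trained on bootstrap replicates (bagging).
pool = []
for _ in range(30):
    idx = rng.integers(0, len(X_tr), len(X_tr))
    pool.append(DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

preds_opt = np.array([c.predict(X_opt) for c in pool])
preds_val = np.array([c.predict(X_val) for c in pool])
preds_te  = np.array([c.predict(X_te)  for c in pool])

def vote_error(mask, preds, y_true):
    """Majority-vote error of the sub-ensemble chosen by the binary mask."""
    if mask.sum() == 0:
        return 1.0
    votes = preds[mask.astype(bool)].mean(axis=0) >= 0.5
    return np.mean(votes != y_true)

# Selection phase: generational GA over binary chromosomes (one bit per pool member).
pop = rng.integers(0, 2, size=(20, len(pool)))
best_val_err, archived = np.inf, None  # global-validation archive
for gen in range(50):
    fit = np.array([vote_error(ind, preds_opt, y_opt) for ind in pop])
    # Archive the individual with the lowest *validation* error seen so far;
    # the validation set is never used to drive the genetic operators.
    for ind in pop:
        e = vote_error(ind, preds_val, y_val)
        if e < best_val_err:
            best_val_err, archived = e, ind.copy()
    # Binary tournament selection, one-point crossover, bit-flip mutation.
    new = []
    while len(new) < len(pop):
        a, b = rng.integers(0, len(pop), 2)
        p1 = pop[a] if fit[a] < fit[b] else pop[b]
        a, b = rng.integers(0, len(pop), 2)
        p2 = pop[a] if fit[a] < fit[b] else pop[b]
        cut = rng.integers(1, len(pool))
        child = np.concatenate([p1[:cut], p2[cut:]])
        child ^= (rng.random(len(pool)) < 0.02).astype(child.dtype)
        new.append(child)
    pop = np.array(new)

print("archived sub-ensemble size:", archived.sum())
print("test error of archived solution:", vote_error(archived, preds_te, y_te))
```

Returning the archived individual rather than the optimization-set optimum is what distinguishes global validation from simply reporting the search's best fitness: because the validation set never influences selection, crossover, or mutation, any gap between optimization-set and validation-set error exposes overfitting at the selection stage.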
