Single and Multi-Objective Genetic Algorithms for the Selection of Ensemble of Classifiers

Many recent works have investigated methods for selecting subsets of classifiers instead of combining all available classifiers. Most of these works conclude that the combiner error rate is a better criterion than diversity for guiding the selection process toward the best-performing subset of classifiers. However, the classifier selection process has to take three different aspects into account: complexity, overfitting and performance. These aspects have not yet been tackled simultaneously in the literature. The study presented in this paper deals with all three in a handwritten digit recognition problem. Different search criteria, such as diversity, error rate and number of classifiers, are applied in single- and multi-objective optimization approaches using genetic algorithms. In our experiments, error rate applied in a single-objective optimization approach was the best objective function for increasing performance. In a multi-objective optimization approach, the generalized diversity and interrater agreement measures, each paired with error rate as a second objective function, were the best measures for reducing complexity while maintaining good performance. Finally, the performance of the solutions found in both the single- and multi-objective optimization processes was improved by applying a global validation method to reduce overfitting.
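
To make the search criteria concrete, the following is a minimal sketch of how a candidate ensemble could be scored on two of the objectives named above, combiner error rate and number of classifiers, and how non-dominated candidates could be kept. It is an illustration only, not the paper's implementation: the bitmask encoding, the majority-vote combiner, the function names, and the convention that an empty ensemble has error 1.0 are all assumptions, and the genetic search itself and the diversity measures are omitted.

```python
from collections import Counter

def majority_vote_error(mask, predictions, labels):
    """Error rate of a majority vote over the classifiers selected by mask.

    mask        -- tuple of 0/1 flags, one per classifier (assumed encoding)
    predictions -- predictions[c][i] is classifier c's label for sample i
    labels      -- true labels
    """
    selected = [p for bit, p in zip(mask, predictions) if bit]
    if not selected:
        return 1.0  # assumed convention: an empty ensemble is always wrong
    errors = 0
    for i, truth in enumerate(labels):
        votes = Counter(p[i] for p in selected)
        # Ties go to the label voted first (Counter keeps insertion order).
        winner = votes.most_common(1)[0][0]
        errors += winner != truth
    return errors / len(labels)

def objectives(mask, predictions, labels):
    """Two objectives to minimize: (error rate, ensemble size)."""
    return (majority_vote_error(mask, predictions, labels), sum(mask))

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates, predictions, labels):
    """Keep the candidate masks that no other candidate dominates."""
    scored = [(m, objectives(m, predictions, labels)) for m in candidates]
    return [m for m, s in scored if not any(dominates(t, s) for _, t in scored)]

# Toy data (hypothetical): three classifiers, four samples.
labels = [0, 1, 1, 0]
predictions = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
]
candidates = [(1, 1, 1), (1, 0, 0), (1, 1, 0)]
front = pareto_front(candidates, predictions, labels)
```

In a single-objective setting only the first component of `objectives` would be minimized; in a multi-objective setting a genetic algorithm would evolve a population of masks and retain a non-dominated set such as `front`. Scoring the final candidates on data not used during the search is one way to limit the overfitting the abstract mentions.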
