A framework for dynamic classifier selection oriented by the classification problem difficulty

Abstract This paper describes a framework for Dynamic Classifier Selection (DCS) whose novelty lies in using features that characterize the difficulty of the classification problem to guide both pool generation and classifier selection. The classification difficulty is described by meta-features estimated from the problem data using complexity measures. First, these features drive the generation of the classifier pool, aiming at a better coverage of the problem space; then, a dynamic classifier selection based on the same kind of features estimates the ability of each classifier to handle the test instance. The rationale is to dynamically select a classifier trained on a subproblem (training subset) whose level of difficulty is similar to that observed in the neighborhood of the test instance, defined over a validation set. A robust experimental protocol based on 30 datasets and 20 replications confirmed that a better understanding of the classification problem difficulty can positively impact DCS performance. For the pool generation method, in 126 of 180 experiments (70.0%) the proposed pool generator improved the accuracy of the evaluated DCS methods. The results of the complete framework, in which both pool generation and classifier selection rely on problem difficulty features, are also very promising: in 165 of 180 experiments (91.6%), the proposed DCS framework achieved better classification accuracy than 6 well-known DCS methods from the literature.
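To make the selection rationale concrete, the sketch below (not the authors' implementation) builds a bagging-style pool, records a single complexity measure, Fisher's discriminant ratio (F1), for each training subset, and at test time selects the classifier whose subset difficulty is closest to the difficulty estimated on the test instance's neighborhood in a validation set. The class name DifficultyDCS, the helper fisher_f1, the use of decision trees, and the restriction to a single measure on binary problems are all simplifying assumptions; the paper relies on a richer set of complexity meta-features for both pool generation and selection.

```python
# Minimal, illustrative sketch (assumptions noted above): bagging-style pool,
# one complexity measure (Fisher's discriminant ratio, F1) per training subset,
# and selection by difficulty similarity in the test-instance neighborhood.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def fisher_f1(X, y):
    """Maximum Fisher's discriminant ratio over features (binary labels assumed)."""
    classes = np.unique(y)
    if len(classes) < 2:
        return 0.0  # degenerate neighborhood: one class only, no separability to measure
    a, b = X[y == classes[0]], X[y == classes[1]]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0) + 1e-12
    return float(np.max(num / den))


class DifficultyDCS:
    """Per test instance, selects the pool member whose training-subset difficulty
    is closest to the difficulty of the instance's neighborhood in the validation set."""

    def __init__(self, n_classifiers=10, k=7):
        self.n_classifiers = n_classifiers
        self.k = k

    def fit(self, X_train, y_train, X_val, y_val):
        X_train, y_train = np.asarray(X_train), np.asarray(y_train)
        rng = np.random.default_rng(0)
        self.pool_, self.difficulty_ = [], []
        for _ in range(self.n_classifiers):
            # Each bootstrap subset is one "subproblem"; its F1 value is its difficulty profile.
            idx = rng.choice(len(X_train), size=len(X_train), replace=True)
            Xs, ys = X_train[idx], y_train[idx]
            self.pool_.append(DecisionTreeClassifier(random_state=0).fit(Xs, ys))
            self.difficulty_.append(fisher_f1(Xs, ys))
        self.X_val_, self.y_val_ = np.asarray(X_val), np.asarray(y_val)
        self.nn_ = NearestNeighbors(n_neighbors=self.k).fit(self.X_val_)
        return self

    def predict(self, X_test):
        preds = []
        for x in np.asarray(X_test):
            # Difficulty observed in the neighborhood of the test instance (validation set).
            _, nbr = self.nn_.kneighbors(x.reshape(1, -1))
            local = fisher_f1(self.X_val_[nbr[0]], self.y_val_[nbr[0]])
            # Pick the classifier trained on the subset with the most similar difficulty.
            best = int(np.argmin(np.abs(np.array(self.difficulty_) - local)))
            preds.append(self.pool_[best].predict(x.reshape(1, -1))[0])
        return np.array(preds)
```

In this sketch the similarity between difficulty profiles is a scalar distance on a single measure; with the multiple complexity measures considered in the paper it would naturally become a distance between meta-feature vectors, and the pool generator would also be driven by those features rather than by plain bootstrapping.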
