Online pruning of base classifiers for Dynamic Ensemble Selection

Abstract: Dynamic Ensemble Selection (DES) techniques aim to select only the most competent classifiers for the classification of each test sample. The key issue in DES is how to estimate the competence of classifiers for each new test sample. Most DES techniques estimate competence by applying a given criterion to the nearest neighbors of the test sample in the validation set; these nearest neighbors compose the region of competence. However, local accuracy criteria applied to the region of competence are not sufficient on their own to accurately estimate the competence of classifiers for all test samples. When the test sample is located in a region with borderline samples of different classes (an indecision region), DES techniques can select classifiers whose decision boundaries do not cross the region of competence and that therefore assign all samples in the region of competence to the same class. In this paper, we propose a dynamic selection framework for two-class problems that detects whether a test sample is located in an indecision region and, if so, prunes the pool of classifiers, pre-selecting the classifiers whose decision boundaries cross the region of competence of the test sample (if such classifiers exist). The proposed framework then uses a DES technique to select the most competent classifiers from the set of pre-selected classifiers. Experiments are conducted using the proposed framework with nine different dynamic selection approaches on 40 classification datasets. The results show that, for every DES technique used in the framework, the proposed framework outperforms the DES technique alone in classification accuracy. The proposal thus significantly improves the classification performance of DES techniques and achieves classification performance statistically equivalent to that of the current state-of-the-art DES frameworks.
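
The pre-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the region of competence is taken as the k nearest validation samples under Euclidean distance, a classifier is said to "cross" the region if it predicts more than one class for those neighbors, and classifiers are modeled as plain callables that map an array of samples to labels. The names `region_of_competence`, `preselect`, and the toy classifiers are hypothetical.

```python
import numpy as np

def region_of_competence(x, X_val, k):
    """Indices of the k validation samples nearest to x (Euclidean distance)."""
    dists = np.linalg.norm(X_val - x, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def preselect(pool, x, X_val, y_val, k=7):
    """Prune the pool for one test sample; returns (classifiers, neighbor indices).

    Assumption: a classifier crosses the region of competence if it assigns
    the neighbors to more than one class.
    """
    idx = region_of_competence(x, X_val, k)
    # Not an indecision region: all neighbors share one class, keep the pool.
    if np.unique(y_val[idx]).size < 2:
        return list(pool), idx
    # Indecision region: keep only classifiers whose boundary crosses it.
    crossing = [clf for clf in pool if np.unique(clf(X_val[idx])).size > 1]
    # If no classifier crosses the region, fall back to the full pool.
    return (crossing if crossing else list(pool)), idx

# Toy two-class problem on a line, with the class change between 2 and 3.
X_val = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_val = np.array([0, 0, 0, 1, 1, 1])

def clf_a(X):
    return (X[:, 0] > 2.5).astype(int)   # boundary inside the data

def clf_b(X):
    return (X[:, 0] > 10.0).astype(int)  # always predicts class 0 here

def clf_c(X):
    return (X[:, 0] > -1.0).astype(int)  # always predicts class 1 here

pool = [clf_a, clf_b, clf_c]

# A test sample near the class boundary lies in an indecision region.
selected, neighbors = preselect(pool, np.array([2.4]), X_val, y_val, k=4)
```

In this toy setting only `clf_a` survives pruning for the borderline sample, whereas a sample deep inside one class (for example `[0.0]` with `k=3`) leaves the pool unchanged. Any DES technique would then be applied to `selected` rather than to the full pool.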
