Combining diversity measures for ensemble pruning

Highlights:
- Presents an ensemble pruning method (DivP) that combines diversity measures.
- Combining diversity measures is advantageous when pruning a pool of classifiers.
- DivP uses graph algorithms to group similar classifiers.
- DivP obtains better results than methods from the literature such as AGOB, DREP, and GASEN.
- DivP generates smaller final ensembles than state-of-the-art methods.

Multiple Classifier Systems (MCSs) are widely used in pattern recognition because it is difficult to find a single classifier that performs well across a wide variety of problems. Studies have shown that MCSs generate large pools of classifiers and that these classifiers are often redundant with one another. Several methods that reduce the number of classifiers without degrading ensemble performance have succeeded by using diversity to drive the pruning process. In this work we propose a pruning method, DivP, that combines different pairwise diversity matrices through a genetic algorithm. The combined diversity matrix is then used to group similar classifiers, i.e., those with low diversity, which should not belong to the same ensemble. To generate candidate ensembles, we transform the combined diversity matrix into one or more graphs and apply a graph coloring method, as sketched below. The proposed method was assessed on 21 datasets from the UCI Machine Learning Repository, and its results were compared with five state-of-the-art ensemble pruning techniques. The results show that the proposed method produces smaller ensembles than the state-of-the-art techniques while improving recognition rates.
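To make the pipeline concrete, here is a minimal Python sketch of the idea described above. It is illustrative only: the function names, the fixed combination weights (which DivP actually evolves with a genetic algorithm), and the similarity threshold used to build the graph are assumptions, not the authors' implementation; NetworkX is assumed for the coloring step.

```python
# Illustrative sketch of a DivP-style pipeline; the weights and threshold
# below are assumptions (DivP optimizes the combination with a GA).
import numpy as np
import networkx as nx

def combine_diversity(matrices, weights):
    """Weighted combination of pairwise diversity matrices."""
    combined = sum(w * m for w, m in zip(weights, matrices))
    return combined / sum(weights)

def candidate_ensembles(div, threshold):
    """Connect classifiers whose pairwise diversity is below `threshold`
    (i.e., similar classifiers) and color the resulting graph. Each color
    class is an independent set, so it contains no two similar classifiers
    and can serve as a candidate ensemble."""
    n = div.shape[0]
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if div[i, j] < threshold:
                g.add_edge(i, j)
    coloring = nx.greedy_color(g, strategy="largest_first")
    groups = {}
    for clf, color in coloring.items():
        groups.setdefault(color, []).append(clf)
    return list(groups.values())

# Toy usage: three symmetric diversity matrices over a pool of 5 classifiers.
rng = np.random.default_rng(0)
mats = []
for _ in range(3):
    m = rng.random((5, 5))
    m = (m + m.T) / 2          # pairwise diversity measures are symmetric
    np.fill_diagonal(m, 0.0)   # a classifier has zero diversity with itself
    mats.append(m)
combined = combine_diversity(mats, weights=[0.5, 0.3, 0.2])
print(candidate_ensembles(combined, threshold=0.4))
```

Because a proper coloring never places two adjacent (similar) vertices in the same color class, every candidate ensemble produced this way consists of mutually diverse classifiers, which is exactly the property the pruning aims for.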

[1] Daniel Hernández-Lobato, et al. An Analysis of Ensemble Pruning Techniques Based on Ordered Aggregation, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Rich Caruana, et al. Ensemble selection from libraries of models, 2004, ICML.

[3] Thomas G. Dietterich, et al. Pruning Adaptive Boosting, 1997, ICML.

[4] Yang Yu, et al. Diversity Regularized Ensemble Pruning, 2012, ECML/PKDD.

[5] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization, 2000, Machine Learning.

[6] Mykola Pechenizkiy, et al. Diversity in search strategies for ensemble feature selection, 2005, Inf. Fusion.

[7] Leo Breiman. Bagging Predictors, 1996, Machine Learning.

[8] Xin Yao, et al. An analysis of diversity measures, 2006, Machine Learning.

[9] William B. Yates, et al. Engineering Multiversion Neural-Net Systems, 1996, Neural Computation.

[10] Ludmila I. Kuncheva, et al. Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy, 2003, Machine Learning.

[11] Subhash C. Bagui, et al. Combining Pattern Classifiers: Methods and Algorithms, 2005, Technometrics.

[12] Fabio Roli, et al. Design of effective multiple classifier systems by clustering of classifiers, 2000, Proceedings 15th International Conference on Pattern Recognition (ICPR-2000).

[13] Luiz Eduardo Soares de Oliveira, et al. Feature selection for ensembles applied to handwriting recognition, 2006, International Journal of Document Analysis and Recognition (IJDAR).

[14] Zhi-Hua Zhou, et al. Ensemble Methods: Foundations and Algorithms, 2012.

[15] Fabio Roli, et al. Design of effective neural network ensembles for image classification purposes, 2001, Image Vis. Comput.

[16] Luiz Eduardo Soares de Oliveira, et al. Pairwise fusion matrix for combining classifiers, 2007, Pattern Recognit.

[17] Robert P. W. Duin, et al. Limits on the majority vote accuracy in classifier fusion, 2003, Pattern Analysis & Applications.

[18] Christino Tamon, et al. On the Boosting Pruning Problem, 2000, ECML.

[19] David H. Wolpert. The Lack of A Priori Distinctions Between Learning Algorithms, 1996, Neural Computation.

[20] Wei Tang, et al. Ensembling neural networks: Many could be better than all, 2002, Artif. Intell.

[21] Gonzalo Martínez-Muñoz, et al. Pruning in ordered bagging ensembles, 2006, ICML.