On the Diversity-Performance Relationship for Majority Voting in Classifier Ensembles

Combining multiple classifier systems (MCSs) has been shown to outperform a single classifier system. It has further been demonstrated that the improvement in ensemble performance depends on both the performance of and the diversity among the individual systems. A variety of diversity measures and ensemble methods have been proposed and studied. It remains a challenging problem, however, to estimate the ensemble performance in terms of the performance of and the diversity among the individual systems. In this paper, we establish upper and lower bounds for Pm (the performance of the ensemble using majority voting) in terms of P (the average performance of the individual systems) and D (the average entropy diversity measure among the individual systems). These bounds are shown to be tight using the concept of a performance distribution pattern (PDP) for the input set. Moreover, we show that when P is large enough, the ensemble performance Pm resulting from a maximum (information-theoretic) entropy PDP is an increasing function of the diversity measure D. Five experiments using data sets from various application domains are conducted to demonstrate the complexity, richness, and diversity of the problem of estimating ensemble performance.
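The three quantities in the abstract can be sketched concretely. Below is a minimal illustration, assuming each classifier's output is reduced to a binary correct/incorrect outcome per instance; the entropy-style diversity used here is Kuncheva's non-pairwise entropy measure, which is one common choice and may differ in detail from the paper's exact definition of D.

```python
import math

def ensemble_stats(outcomes):
    """outcomes[i][j] = 1 if classifier i labels instance j correctly, else 0.

    Returns (P, D, Pm): average individual accuracy, entropy-style diversity,
    and accuracy of the majority-vote ensemble.
    """
    L = len(outcomes)        # number of classifiers
    N = len(outcomes[0])     # number of instances
    # P: average accuracy of the individual classifiers
    P = sum(sum(row) for row in outcomes) / (L * N)
    # Per instance, how many of the L classifiers are correct
    correct_counts = [sum(outcomes[i][j] for i in range(L)) for j in range(N)]
    # Pm: the ensemble is correct when a strict majority of the L votes is correct
    Pm = sum(1 for c in correct_counts if c > L / 2) / N
    # D: entropy-style diversity, maximal when the correct/incorrect votes on an
    # instance are split as evenly as possible (Kuncheva's entropy measure)
    denom = L - math.ceil(L / 2)
    D = sum(min(c, L - c) / denom for c in correct_counts) / N
    return P, D, Pm
```

For example, three classifiers that each err on a different one of three instances give P = 2/3 but Pm = 1 with D = 1: maximal diversity lets majority voting recover every instance, which is the flavor of diversity-performance interaction the bounds in the paper quantify.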
