Classifier Combining: Analytical Results and Implications

Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This paper summarizes our recent theoretical results that quantify the improvements due to multiple classifier combining. Furthermore, we present an extension of this theory that leads to an estimate of the Bayes error rate. Practical aspects such as expressing the confidences in decisions and determining the best data partition/classifier selection are also discussed.
