Boundary variance reduction for improved classification through hybrid networks

Several researchers have shown experimentally that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This paper provides an analytical framework that quantifies the improvements in classification results due to linear combining. We show that combining networks in output space reduces the variance of the actual decision region boundaries around the optimum boundary. In the absence of network bias, the added classification error is directly proportional to the boundary variance. Moreover, if the network errors are independent, the variance of the boundary location is reduced by a factor of N, the number of classifiers that are combined. In the presence of network bias, the reduction factor is at most N, depending on the interaction between the network biases. We discuss how the individual networks can be selected to achieve significant gains through combining, and support the analysis with experimental results on 25-dimensional sonar data. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space.
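The abstract's central quantitative claim (added error proportional to boundary variance, with the variance reduced by a factor of N when the network errors are independent and unbiased) can be illustrated with a short simulation. The sketch below is not taken from the paper; it assumes a one-dimensional two-Gaussian problem and models each network's decision boundary as an unbiased Gaussian perturbation of the optimal boundary, so that, to first order, the combined boundary is the average of the individual boundaries. All parameter values (sigma, the trial count) are illustrative assumptions.

```python
# Minimal simulation sketch (not the paper's code) of boundary variance
# reduction through linear combining. Assumption: each of N unbiased
# networks places its boundary at b_i = b* + eps_i, eps_i ~ N(0, sigma^2)
# i.i.d., and the combined boundary is (to first order) the mean of b_i.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# 1-D two-class problem: class 0 ~ N(-1, 1), class 1 ~ N(+1, 1),
# equal priors; the Bayes-optimal boundary is b* = 0.
def error_rate(b):
    """Classification error when deciding class 0 for x < b."""
    return 0.5 * (1.0 - norm.cdf(b + 1.0)) + 0.5 * norm.cdf(b - 1.0)

bayes_error = error_rate(0.0)
sigma = 0.3      # std. dev. of each network's boundary perturbation (assumed)
trials = 20000

for N in (1, 2, 5, 10, 25):
    # Combined boundary per trial: mean of N i.i.d. perturbed boundaries,
    # so Var(b_comb) = sigma^2 / N for independent, unbiased networks.
    b_comb = rng.normal(0.0, sigma, size=(trials, N)).mean(axis=1)
    var_b = b_comb.var()
    added_err = error_rate(b_comb).mean() - bayes_error
    print(f"N={N:3d}  Var(boundary)={var_b:.5f} "
          f"(sigma^2/N={sigma**2 / N:.5f})  added error={added_err:.5f}")
```

Under these assumptions, the printed boundary variance should track sigma^2/N, and the added error above the Bayes rate should shrink by roughly the same factor of N, consistent with the independent-error case analyzed in the paper.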
