When Networks Disagree: Ensemble Methods for Hybrid Neural Networks

Abstract: This paper presents a general theoretical framework for ensemble methods that construct significantly improved regression estimates. Given a population of regression estimators, the authors construct a hybrid estimator that is at least as good, in the mean-squared-error sense, as any estimator in the population. They argue that the ensemble method presented has several properties: (1) it efficiently uses all the networks of a population -- none of the networks need to be discarded; (2) it efficiently uses all of the available data for training without over-fitting; (3) it inherently performs regularization by smoothing in functional space, which helps to avoid over-fitting; (4) it utilizes local minima to construct improved estimates, whereas other neural network algorithms are hindered by local minima; (5) it is ideally suited for parallel computation; (6) it leads to a very useful and natural measure of the number of distinct estimators in a population; and (7) the optimal parameters of the ensemble estimator are given in closed form. Experimental results show that the ensemble method dramatically improves neural network performance on difficult real-world optical character recognition tasks.
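The abstract states that the ensemble's optimal combination weights are available in closed form but does not reproduce the formula. Below is a minimal sketch of one standard closed-form construction (an assumption here, not quoted from the paper): the weights minimize mean squared error, subject to summing to one, via the inverse of the estimators' error-correlation matrix estimated on held-out data. The function and variable names are illustrative.

```python
import numpy as np

def ensemble_weights(preds, targets):
    """Closed-form combination weights for a population of regression
    estimators (a sketch; the specific formula is an assumed standard
    construction, not quoted from the paper).

    preds   : array of shape (n_estimators, n_samples), each estimator's
              predictions on held-out data
    targets : array of shape (n_samples,), true regression targets
    """
    errors = preds - targets                 # misfit of each estimator
    C = errors @ errors.T / errors.shape[1]  # error-correlation matrix
    C_inv = np.linalg.pinv(C)                # pseudo-inverse for stability
    w = C_inv.sum(axis=1) / C_inv.sum()      # weights constrained to sum to 1
    return w

def ensemble_predict(preds, w):
    """Weighted average of the population's predictions."""
    return w @ preds

# Toy usage: three noisy estimators of y = sin(x).
rng = np.random.default_rng(0)
x = np.linspace(0, np.pi, 200)
y = np.sin(x)
preds = np.stack([y + rng.normal(0, s, x.size) for s in (0.1, 0.2, 0.3)])
w = ensemble_weights(preds, y)
print("weights:", w)
print("ensemble MSE:", np.mean((ensemble_predict(preds, w) - y) ** 2))
print("best single MSE:", min(np.mean((p - y) ** 2) for p in preds))
```

Because the weights are chosen to minimize the quadratic error among all weight vectors summing to one, the combined estimator's error on the held-out data is never worse than that of any single member, which is consistent with the abstract's claim that no network in the population needs to be discarded.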
