Combining locally trained neural networks by introducing a reject class

This paper presents a new strategy for building and combining a local committee from a given dataset. Training the local committee proceeds in two stages: active data partitioning and recombination through an additional reject class. Active data partitioning is a preprocessing step that splits the given dataset into several similar subsets using active learning. The additional reject class plays an important role in assigning a focused area to each individual network of the committee. To combine the outputs of the individual networks, we use a sum-rule criterion, assuming that the outputs of the individual networks approximate a posteriori Bayesian probabilities. All learning procedures are based on the active learning paradigm. Experiments are performed on two real-world datasets from the UCI machine learning repository. The results show that the active data partitioning and recombination strategy is very effective for building a local committee and that the combined result outperforms other algorithms, although it can be affected by the training error level ε.
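
The abstract does not spell out the combination formula, so the following is only a minimal sketch of one plausible reading: each committee member outputs posterior estimates over the real classes plus its own reject class (used for examples outside its focused area), and the sum rule is applied to the non-reject outputs. The function name `combine_sum_rule` and the example numbers are illustrative assumptions, not the authors' code.

```python
import numpy as np

def combine_sum_rule(outputs, n_classes):
    """Sum-rule combination of committee outputs with a reject class.

    outputs: shape (n_members, n_classes + 1); the last column is each
    member's reject-class output for examples outside its focused area.
    Outputs are assumed to approximate posterior class probabilities,
    following the paper's Bayesian interpretation.
    """
    outputs = np.asarray(outputs, dtype=float)
    # Drop each member's reject-class column and sum the remaining
    # class posteriors across members (the sum rule).
    class_scores = outputs[:, :n_classes].sum(axis=0)
    return int(np.argmax(class_scores))

# Example: three members, two real classes plus a reject class.
member_outputs = [
    [0.7, 0.1, 0.2],  # member 1 is confident in class 0
    [0.1, 0.2, 0.7],  # member 2 mostly rejects (example outside its area)
    [0.6, 0.3, 0.1],  # member 3 also favours class 0
]
print(combine_sum_rule(member_outputs, n_classes=2))  # -> 0
```

A member that rejects an example contributes little to the real-class scores, so the decision is effectively delegated to the members whose focused area covers that example.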
