An Efficient EM-based Training Algorithm for Feedforward Neural Networks

A fast training algorithm is developed for two-layer feedforward neural networks based on a probabilistic model of the hidden representations and the EM algorithm. The algorithm decomposes the training of the original two-layer network into the training of a set of single neurons, each of which is then trained via a weighted linear regression algorithm. The algorithm yields significant improvements in training speed on several benchmark problems. Copyright 1997 Elsevier Science Ltd. All Rights Reserved.
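To make the decomposition concrete, the following is a minimal NumPy sketch of an EM-style training loop of the kind described: an E-step proposes target activations for the hidden units, and an M-step fits the output layer and each hidden neuron independently by (weighted) least squares. The particular E-step heuristic (a gradient-based nudge of the activations), the ridge term, and all function and parameter names are illustrative assumptions, not the paper's exact probabilistic model or update equations.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_em_style(X, y, n_hidden=8, n_iter=100, step=0.5, ridge=1e-3, seed=0):
    """Illustrative EM-style decomposition for a two-layer network (assumed form).

    E-step: propose target activations for the hidden layer.
    M-step: fit the output layer by ridge regression on the hidden activations,
            and fit each hidden neuron separately by weighted least squares on
            the logits of its target activations.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])                  # inputs with bias column
    W = rng.normal(scale=0.5, size=(d + 1, n_hidden))     # input -> hidden weights
    v = rng.normal(scale=0.5, size=n_hidden + 1)          # hidden -> output weights

    for _ in range(n_iter):
        H = sigmoid(Xb @ W)                               # current hidden activations
        Hb = np.hstack([H, np.ones((n, 1))])

        # M-step (output neuron): ridge regression of y on hidden activations.
        A = Hb.T @ Hb + ridge * np.eye(n_hidden + 1)
        v = np.linalg.solve(A, Hb.T @ y)

        # E-step (assumed heuristic): nudge hidden activations downhill on the
        # squared output error to obtain per-unit targets, then clip them into
        # the open interval (0, 1) so the logit below is defined.
        err = Hb @ v - y
        T = np.clip(H - step * np.outer(err, v[:-1]), 1e-4, 1 - 1e-4)

        # M-step (hidden neurons): each neuron is fit independently by weighted
        # least squares of logit(target) on the inputs, weighted by the local
        # sigmoid slope h * (1 - h).
        Z = np.log(T / (1.0 - T))
        for j in range(n_hidden):
            w_diag = H[:, j] * (1.0 - H[:, j])
            Xw = Xb * w_diag[:, None]
            A = Xw.T @ Xb + ridge * np.eye(d + 1)
            W[:, j] = np.linalg.solve(A, Xw.T @ Z[:, j])

    return W, v


if __name__ == "__main__":
    # Tiny demonstration on the XOR problem, a common benchmark in this literature.
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([0.0, 1.0, 1.0, 0.0])
    W, v = train_em_style(X, y, n_hidden=4, n_iter=200)
    H = sigmoid(np.hstack([X, np.ones((4, 1))]) @ W)
    print(np.round(np.hstack([H, np.ones((4, 1))]) @ v, 2))
```

The key point the sketch illustrates is that, once hidden-unit targets are available, each neuron's weights can be updated by solving a small linear system rather than by iterative gradient steps through the whole network.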
