Effect of Batch Learning in Multilayer Neural Networks

This paper discusses batch gradient descent learning in multilayer networks with a large number of training data. We emphasize the difference between regular cases, in which the prepared model has the same size as the true function, and overrealizable cases, in which the model has surplus hidden units beyond those needed to realize the true function. First, an experimental study of multilayer perceptrons and linear neural networks (LNN) shows that batch learning induces strong overtraining in both models in overrealizable cases, which means the degradation of generalization error caused by surplus units can be alleviated by early stopping. We then theoretically analyze the training dynamics of LNN and show that this overtraining is caused by shrinkage of the parameters corresponding to surplus units.
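
As an illustration of the overrealizable setting described above, the following minimal sketch (not from the paper; all dimensions, sample sizes, noise level, and learning rate are illustrative assumptions) trains a two-layer linear network by batch gradient descent on data generated by a rank-1 true map, while the model has surplus hidden units. Tracking training and test error typically shows the test error bottoming out and then creeping upward while the training error keeps decreasing, i.e., overtraining.

```python
# Hypothetical illustration of batch learning in an overrealizable LNN.
# Model: y = B A x with 3 hidden units; true map has rank 1 (surplus units).
# All sizes, learning rate, step count, and noise level are assumptions.
import numpy as np

rng = np.random.default_rng(0)

d_in, d_hid, d_out = 4, 3, 4
W_true = np.outer(rng.normal(size=d_out), rng.normal(size=d_in))  # rank-1 truth

n_train, n_test, noise = 50, 1000, 0.1
X_tr = rng.normal(size=(n_train, d_in))
Y_tr = X_tr @ W_true.T + noise * rng.normal(size=(n_train, d_out))
X_te = rng.normal(size=(n_test, d_in))
Y_te = X_te @ W_true.T + noise * rng.normal(size=(n_test, d_out))

A = 0.1 * rng.normal(size=(d_hid, d_in))   # input-to-hidden weights
B = 0.1 * rng.normal(size=(d_out, d_hid))  # hidden-to-output weights
lr = 0.01

def mse(X, Y):
    """Mean squared error of the composite linear map B @ A."""
    return np.mean((X @ (B @ A).T - Y) ** 2)

for step in range(20001):
    # Batch gradient descent: gradients averaged over the full training set.
    E = X_tr @ (B @ A).T - Y_tr            # residuals, shape (n_train, d_out)
    gB = (E.T @ (X_tr @ A.T)) / n_train    # dL/dB up to a constant factor
    gA = (B.T @ E.T @ X_tr) / n_train      # dL/dA up to a constant factor
    B -= lr * gB
    A -= lr * gA
    if step % 2000 == 0:
        print(f"step {step:6d}  train {mse(X_tr, Y_tr):.4f}"
              f"  test {mse(X_te, Y_te):.4f}")
# Typical behavior: training error decreases monotonically, while test error
# reaches a minimum and then rises as the surplus directions fit the noise,
# which is why early stopping can help in the overrealizable case.
```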