A Numerical Study on Learning Curves in Stochastic Multilayer Feedforward Networks

The universal asymptotic scaling laws proposed by Amari et al. are studied in large-scale simulations on a CM-5. Small stochastic multilayer feedforward networks trained with backpropagation are investigated. For large numbers of training patterns t, the asymptotic generalization error scales as 1/t, as predicted. For a medium range of t, a faster 1/t^2 scaling is observed; this effect is explained by higher-order corrections of the likelihood expansion. For small t, the scaling law changes drastically when the network undergoes a transition from strong overfitting to effective learning.
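The scaling exponents discussed above can be read off a simulated learning curve by linear regression in log-log space, since e_g(t) ~ c * t^(-alpha) implies log e_g = log c - alpha * log t. A minimal sketch, using synthetic data in place of measured generalization errors (the constant c and the pattern counts t are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Synthetic generalization errors following the predicted 1/t asymptotics;
# c and the pattern counts t are hypothetical stand-ins for simulation data.
t = np.array([100, 200, 400, 800, 1600, 3200], dtype=float)
c = 5.0
e_g = c / t

# The slope of log(e_g) versus log(t) is -alpha; fit a line and negate it.
alpha = -np.polyfit(np.log(t), np.log(e_g), 1)[0]
print(round(alpha, 3))  # -> 1.0 for exact 1/t data
```

With real simulation data, the fitted alpha would be close to 1 in the large-t regime and close to 2 in the medium-t regime described in the abstract.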

[1] K. Takeuchi et al., Asymptotic efficiency of statistical estimators: concepts and higher order asymptotic efficiency, 1981.

[2] Shun-ichi Amari et al., Differential-geometrical methods in statistics, 1985.

[3] David Haussler et al., What Size Net Gives Valid Generalization?, Neural Computation, 1989.

[4] M. Opper et al., On the ability of the optimal perceptron to generalise, 1990.

[5] Sompolinsky et al., Learning from examples in large neural networks, Physical Review Letters, 1990.

[6] Heskes et al., Learning processes in neural networks, Physical Review A, 1991.

[7] David Haussler et al., Calculation of the learning curve of Bayes optimal classification algorithm for learning a perceptron with noise, COLT '91, 1991.

[8] Hansel et al., Broken symmetries in multilayered perceptrons, Physical Review A, 1992.

[9] Sompolinsky et al., Statistical mechanics of learning from examples, Physical Review A, 1992.

[10] Shun-ichi Amari et al., Learning Curves, Model Selection and Complexity of Neural Networks, NIPS, 1992.

[11] T. Watkin et al., The statistical mechanics of learning a rule, 1993.

[12] Shun-ichi Amari et al., Statistical Theory of Learning Curves under Entropic Loss Criterion, Neural Computation, 1993.

[13] H. Schwarze et al., Generalization in Fully Connected Committee Machines, 1993.

[14] Oh et al., Generalization in a two-layer neural network, Physical Review E, 1993.

[15] Michael Finke et al., Estimating A-Posteriori Probabilities using Stochastic Network Models, 1993.

[16] David Haussler et al., Rigorous Learning Curve Bounds from Statistical Mechanics, COLT '94, 1994.

[17] Klaus-Robert Müller et al., Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?, NIPS, 1995.

[18] Saad et al., On-line learning in soft committee machines, Physical Review E, 1995.

[19] Dana Ron et al., An experimental and theoretical comparison of model selection methods, COLT '95, 1995.

[20] Saad et al., Exact solution for on-line learning in multilayer neural networks, Physical Review Letters, 1995.

[21] F. Komaki, On asymptotic properties of predictive distributions, 1996.

[22] Klaus-Robert Müller et al., Asymptotic statistical theory of overtraining and cross-validation, IEEE Transactions on Neural Networks, 1997.

[23] Manfred Opper et al., Statistical mechanics of generalization, 1998.

[24] S. Amari et al., Large Scale Simulations for Learning Curves, 2022.