Paper information - Functional optimization of online algorithms in multilayer neural networks

Functional optimization of online algorithms in multilayer neural networks

We study the online dynamics of learning in fully connected soft committee machines in the student-teacher scenario. The locally optimal modulation function, which determines the learning algorithm, is obtained from a variational argument so as to maximize the average decay of the generalization error per example. Simulation results for the resulting algorithm are presented for a few cases. The symmetric phase plateaux are found to be vastly reduced in comparison to those found when online backpropagation algorithms are used. A discussion of the implementation of these ideas as practical algorithms is given.
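
To make the setting concrete, the following is a minimal sketch of the baseline the abstract compares against: plain online backpropagation (gradient descent) for a student soft committee machine learning from a teacher. It is not the optimized algorithm of the paper. The activation g(x) = erf(x/√2), the 1/N scaling of the update, the sizes N, K, M, the learning rate `eta`, and all function names are assumptions chosen for illustration. The generalization error is estimated by Monte Carlo sampling rather than computed analytically.

```python
import math
import random


def g(x):
    """Sigmoidal hidden-unit activation (assumed): g(x) = erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    """Derivative of g: sqrt(2 / pi) * exp(-x^2 / 2)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def fields(weights, xi):
    """Local field of each hidden unit: dot product of its weights with xi."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, xi)) for w in weights]


def output(weights, xi):
    """Soft committee machine output: unweighted sum of hidden activations."""
    return sum(g(h) for h in fields(weights, xi))


def online_step(student, teacher, xi, eta):
    """One online update from a single example; modifies student in place."""
    n = len(xi)
    h = fields(student, xi)
    error = output(teacher, xi) - sum(g(h_i) for h_i in h)
    for w, h_i in zip(student, h):
        # Modulation function of plain gradient descent (the baseline only).
        modulation = eta * error * g_prime(h_i)
        for j in range(n):
            w[j] += modulation * xi[j] / n


def generalization_error(student, teacher, n, rng, samples=2000):
    """Monte Carlo estimate of the mean of (student - teacher)^2 / 2."""
    total = 0.0
    for _ in range(samples):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += 0.5 * (output(student, xi) - output(teacher, xi)) ** 2
    return total / samples


def random_vectors(count, n, scale, rng):
    """Gaussian weight vectors whose norm is roughly `scale`."""
    sigma = scale / math.sqrt(n)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(count)]


def run(n=20, k=2, m=2, eta=0.5, steps=2000, seed=0):
    """Train a K-unit student on an M-unit teacher; return error before, after."""
    rng = random.Random(seed)
    teacher = random_vectors(m, n, 1.0, rng)
    student = random_vectors(k, n, 0.1, rng)  # small, nearly symmetric start
    before = generalization_error(student, teacher, n, rng)
    for _ in range(steps):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        online_step(student, teacher, xi, eta)
    after = generalization_error(student, teacher, n, rng)
    return before, after


if __name__ == "__main__":
    print(run())
```

In this sketch the learning rule is fixed entirely by the line that sets `modulation`. Replacing that expression with a different function of the fields and the error gives a different online algorithm, which is the degree of freedom the abstract describes optimizing.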

Nestor Caticha | Renato Vicente
