Transient dynamics of on-line learning in two-layered neural networks

The dynamics of on-line learning in neural networks with continuous units is dominated by plateaux in the time dependence of the generalization error. Using tools from statistical mechanics, we show for a soft committee machine the existence of several fixed points of the learning dynamics. These fixed points give rise to complicated behaviour, such as cascade-like runs through successive plateaux with decreasing values of the corresponding generalization error. We find learning-rate-dependent phenomena, such as the splitting and disappearance of fixed points of the equations of motion. The dependence of plateau lengths on the initial conditions is described analytically, and simulations confirm the results.
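The plateau behaviour described above can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a setup common in the statistical-mechanics literature on on-line learning: a student soft committee machine with K hidden units and activation g(h) = erf(h/√2), a teacher of the same architecture with orthonormal weight vectors, independent standard Gaussian inputs, and gradient-descent updates scaled by η/N. The function names (`simulate`, `generalization_error`) and all parameter values are illustrative choices. The generalization error is evaluated with the standard closed-form arcsin expression for this setup, written in terms of the overlaps Q (student-student), R (student-teacher) and T (teacher-teacher).

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def g(h):
    """Hidden-unit activation g(h) = erf(h / sqrt(2)) (assumed convention)."""
    return _erf(np.asarray(h, dtype=float) / np.sqrt(2.0))


def g_prime(h):
    """Derivative of g: sqrt(2/pi) * exp(-h^2 / 2)."""
    return np.sqrt(2.0 / np.pi) * np.exp(-0.5 * np.asarray(h, dtype=float) ** 2)


def generalization_error(J, B):
    """Closed-form generalization error for Gaussian inputs.

    J: student weights, shape (K, N); B: teacher weights, shape (M, N).
    """
    Q, R, T = J @ J.T, J @ B.T, B @ B.T
    q = np.sqrt(1.0 + np.diag(Q))
    t = np.sqrt(1.0 + np.diag(T))
    student = np.arcsin(Q / np.outer(q, q)).sum()
    teacher = np.arcsin(T / np.outer(t, t)).sum()
    cross = np.arcsin(R / np.outer(q, t)).sum()
    return (student + teacher - 2.0 * cross) / np.pi


def simulate(N=100, K=2, eta=0.5, alpha_max=100.0, q0=1e-2, seed=0):
    """On-line gradient descent; returns (alpha, error), one point per N examples."""
    rng = np.random.default_rng(seed)
    # Teacher: K orthonormal vectors (rows), so T is the identity matrix.
    B = np.linalg.qr(rng.standard_normal((N, K)))[0].T
    # Student: small random weights with squared norm of order q0.
    J = np.sqrt(q0 / N) * rng.standard_normal((K, N))
    alphas, errors = [0.0], [generalization_error(J, B)]
    for step in range(1, int(alpha_max * N) + 1):
        x = rng.standard_normal(N)              # fresh example at every step
        h = J @ x                               # student local fields
        target = g(B @ x).sum()                 # teacher output
        delta = g_prime(h) * (target - g(h).sum())
        J += (eta / N) * np.outer(delta, x)     # gradient step on the squared error
        if step % N == 0:                       # alpha = (number of examples) / N
            alphas.append(step / N)
            errors.append(generalization_error(J, B))
    return np.array(alphas), np.array(errors)
```

Calling `alphas, errors = simulate()` and plotting `errors` against `alphas` on a logarithmic vertical axis should show a fast initial decay followed by a nearly flat stretch. How long that stretch lasts depends on `q0`, `eta`, `N` and the random seed, and escaping it may require a larger `alpha_max` than the default used here.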
