Bifurcations of Recurrent Neural Networks in Gradient Descent Learning

The asymptotic behavior of a recurrent neural network changes qualitatively at certain points in the parameter space, which are known as "bifurcation points". At bifurcation points, the output of a network can change discontinuously as the parameters change, so convergence of gradient descent algorithms is not guaranteed. Furthermore, the learning equations used for error gradient estimation can be unstable. However, some kinds of bifurcations are inevitable when training a recurrent network as an automaton or an oscillator. Some of the factors underlying successful training of recurrent networks are investigated, such as the choice of initial connections, the choice of input patterns, teacher forcing, and truncated learning equations.
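As a rough illustration of what a bifurcation looks like, the sketch below iterates a single self-connected unit, x <- tanh(w * x). This toy model, the function name `asymptotic_state`, and the chosen weights are assumptions made for illustration and are not taken from the paper. For w < 1 the state settles at 0; for w > 1 the origin becomes unstable and the state settles at a nonzero fixed point, so the asymptotic behavior changes qualitatively at w = 1.

```python
import math


def asymptotic_state(w, x0=0.5, steps=5000):
    """Iterate x <- tanh(w * x) from x0 and return the state after `steps` updates."""
    x = x0
    for _ in range(steps):
        x = math.tanh(w * x)
    return x


if __name__ == "__main__":
    # Sweep the self-connection weight across the (assumed) bifurcation at w = 1.
    for w in (0.5, 0.9, 1.1, 2.0):
        print(f"w = {w:.1f}  ->  x_inf = {asymptotic_state(w):+.4f}")
```

In this particular toy model the fixed point moves continuously away from 0 as w passes 1, so it shows only the qualitative change in the set of attractors; it does not show the discontinuous output jumps or the unstable learning equations that the abstract also mentions.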
