Learning in chaotic recurrent neural networks

Training recurrent neural networks (RNNs) is a long-standing open problem in both theoretical neuroscience and machine learning. In particular, training chaotic RNNs was previously thought to be impossible. The traditional training methods that do exist are generally considered weak and typically fail on all but the simplest problems and smallest networks. We review earlier methods, such as gradient-descent approaches and their shortcomings, as well as more recent approaches such as the Echo State Network and related ideas. We show that chaotic RNNs can be trained to generate multiple patterns, and we describe a novel supervised learning paradigm, which we call FORCE learning, that accomplishes this training. The network architectures we analyze range from, at one extreme, training only the input weights to a readout unit that feeds strongly back into the network, to, at the other extreme, generic learning of all synapses within the RNN. We present these models as candidate networks for motor pattern generation that can learn multiple high-dimensional patterns while coping with the spontaneous, ongoing, and complex dynamics of a recurrent network. As an example, we show a single RNN that generates the aperiodic dynamics of all 95 joint angles for both human walking and running motions recorded with motion capture. Finally, we apply the learning techniques developed for chaotic RNNs to a novel, unsupervised method for extracting predictable signals from high-dimensional time series data, when such predictable signals exist.
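The readout-only architecture described above combines a fixed, chaotic recurrent pool with a linear readout whose output is fed back into the network, and trains the readout weights with a recursive least-squares rule that keeps the output error small at every step. The following is a minimal NumPy sketch of that idea, assuming the standard echo-state-style rate equations; the network size, gain, time constants, toy target, and variable names (J, w_f, P) are illustrative assumptions, not the settings used in the original work.

```python
import numpy as np

# Hedged sketch of FORCE-style training of a linear readout on a
# chaotic rate network with feedback. Parameters are illustrative.
rng = np.random.default_rng(0)

N = 500           # recurrent units
dt, tau = 0.1, 1.0
g = 1.5           # recurrent gain > 1 places the network in the chaotic regime
alpha = 1.0       # RLS regularizer: P starts as I / alpha

J = g * rng.standard_normal((N, N)) / np.sqrt(N)   # fixed random recurrence
w_f = 2.0 * rng.uniform(-1.0, 1.0, N)               # fixed feedback weights
w = np.zeros(N)                                      # trained readout weights
P = np.eye(N) / alpha                                # running inverse correlation estimate

x = 0.5 * rng.standard_normal(N)                     # network state
r = np.tanh(x)
z = w @ r

def target(t):
    # toy one-dimensional target pattern
    return np.sin(0.02 * t) + 0.5 * np.sin(0.04 * t)

for t in range(10000):
    # rate dynamics with the readout fed back into the network
    x += (dt / tau) * (-x + J @ r + w_f * z)
    r = np.tanh(x)
    z = w @ r

    # recursive least-squares (FORCE) update of the readout weights
    k = P @ r
    c = 1.0 / (1.0 + r @ k)
    P -= c * np.outer(k, k)
    e = z - target(t)          # error measured before the update
    w -= c * e * k             # keeps the output error small throughout training
```

After training, the readout weights are frozen and the same feedback loop is relied on to keep reproducing the learned pattern; in this readout-only variant the recurrent weights J are never modified.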
