Local online learning in recurrent networks with random feedback

Recurrent neural networks (RNNs) enable the production and processing of time-dependent signals such as those involved in movement or working memory. Classic gradient-based algorithms for training RNNs have been available for decades, but they are inconsistent with biological features of the brain, such as causality and locality. We derive an approximation to gradient-based learning that comports with these constraints by requiring synaptic weight updates to depend only on local information about pre- and postsynaptic activities, together with a random feedback projection of the RNN output error. In addition to providing mathematical arguments for the effectiveness of the new learning rule, we show through simulations that it can be used to train an RNN to perform a variety of tasks. Finally, to overcome the difficulty of training over very large numbers of timesteps, we propose an augmented circuit architecture that allows the RNN to concatenate short-duration patterns into longer sequences.
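
As a rough illustration of the kind of update described above, the sketch below implements a hypothetical local, online rule in NumPy: each recurrent synapse keeps an eligibility trace built from its own pre- and postsynaptic activity, and weight changes are gated by the output error projected back through a fixed random matrix rather than through the transposed forward weights. The network size, time constant, learning rate, tanh nonlinearity, and sinusoidal target are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rec, n_out, tau, lr = 64, 1, 10.0, 1e-3
T = 200  # timesteps per trial

# Recurrent weights, readout weights, and a fixed random feedback matrix.
W_rec = rng.normal(0.0, 1.5 / np.sqrt(n_rec), (n_rec, n_rec))
W_out = np.zeros((n_out, n_rec))
B = rng.normal(0.0, 1.0, (n_rec, n_out))  # random feedback projection of the error

target = np.sin(2 * np.pi * np.arange(T) / T)  # toy one-dimensional target

for trial in range(100):
    h = np.zeros(n_rec)            # hidden state
    P = np.zeros((n_rec, n_rec))   # eligibility trace, one entry per recurrent synapse
    for t in range(T):
        r = np.tanh(h)                      # presynaptic firing rates
        h = h + (-h + W_rec @ r) / tau      # leaky recurrent dynamics
        r_new = np.tanh(h)
        y = W_out @ r_new                   # network output
        err = target[t] - y                 # output error at this timestep

        # Eligibility trace: low-pass-filtered product of the postsynaptic
        # gain and the presynaptic rate -- purely local information.
        post_gain = 1.0 - r_new ** 2
        P = (1.0 - 1.0 / tau) * P + np.outer(post_gain, r) / tau

        # Updates use only the local trace plus the randomly projected error,
        # so no backward pass through time or weight transport is needed.
        W_rec += lr * (B @ err)[:, None] * P
        W_out += lr * np.outer(err, r_new)
```

Because the eligibility trace and the projected error are both available at the current timestep, the update is causal and each synapse uses only locally available quantities plus a broadcast error signal, which is the sense in which the rule is local and online.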
