A solution to the learning dilemma for recurrent networks of spiking neurons

Recurrently connected networks of spiking neurons underlie the astounding information-processing capabilities of the brain. Yet despite extensive research, how they can learn through synaptic plasticity to carry out complex network computations remains unclear. We argue that two pieces of this puzzle are provided by experimental data from neuroscience: eligibility traces maintained at synapses, and top-down learning signals such as dopamine that gate plasticity. A mathematical result tells us how these two pieces need to be combined to enable biologically plausible online network learning through gradient descent, including deep reinforcement learning. The resulting learning method, called e-prop, approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence.

Bellec et al. present a mathematically founded approximation for gradient-descent training of recurrent neural networks that does not require propagating information backward in time. This enables biologically plausible training of spike-based neural network models with working memory and supports on-chip training of neuromorphic hardware.
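To make the e-prop principle concrete, the sketch below illustrates the factorization behind the method: the gradient for each synapse is approximated online as a sum over time of a neuron-specific learning signal multiplied by a locally computed eligibility trace, so no backward pass through time is needed. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the network sizes, the piecewise-linear surrogate derivative, the leaky readout, the toy regression task, and the fixed random feedback matrix B (in the spirit of the paper's random e-prop variant, which uses broadcast alignment) are all choices made here for brevity.

```python
import numpy as np

# Minimal sketch of an e-prop style update for a recurrent network of leaky
# integrate-and-fire (LIF) neurons. All sizes, constants, and the task are
# illustrative assumptions, not values from the paper.

rng = np.random.default_rng(0)
n_in, n_rec, n_out, T = 20, 50, 2, 100
alpha = 0.9   # membrane decay exp(-dt / tau_m)
kappa = 0.9   # decay of the leaky readout neurons
v_th = 0.6    # spike threshold

W_in = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_rec, n_in))
W_rec = rng.normal(0.0, 1.0 / np.sqrt(n_rec), (n_rec, n_rec))
W_out = rng.normal(0.0, 1.0 / np.sqrt(n_rec), (n_out, n_rec))
B = rng.normal(0.0, 1.0, (n_rec, n_out))  # fixed random feedback weights

def pseudo_derivative(v):
    """Piecewise-linear surrogate for the derivative of the spike function."""
    return 0.3 * np.maximum(0.0, 1.0 - np.abs((v - v_th) / v_th))

x = (rng.random((T, n_in)) < 0.05).astype(float)  # random input spike trains
y_target = np.zeros((T, n_out))                   # placeholder regression target

v = np.zeros(n_rec)                # membrane potentials
z = np.zeros(n_rec)                # spikes at the current step
z_bar = np.zeros(n_rec)            # low-pass filtered presynaptic spikes
y = np.zeros(n_out)                # readout
filt_e = np.zeros((n_rec, n_rec))  # eligibility traces, filtered with kappa
dW_rec = np.zeros_like(W_rec)      # accumulated gradient estimate

for t in range(T):
    v = alpha * v + W_in @ x[t] + W_rec @ z - v_th * z  # LIF dynamics, soft reset
    z = (v > v_th).astype(float)                        # spike generation
    y = kappa * y + W_out @ z                           # leaky readout
    L = B @ (y - y_target[t])                           # online learning signal L_j
    e = np.outer(pseudo_derivative(v), z_bar)           # eligibility trace e_ji
    filt_e = kappa * filt_e + e                         # match the readout filter
    dW_rec += L[:, None] * filt_e                       # sum_t L_j * e_ji
    z_bar = alpha * z_bar + z                           # update presynaptic filter

W_rec -= 1e-3 * dW_rec  # one gradient-descent step on the recurrent weights
```

Because both the eligibility traces and the learning signal are available at every time step, the update could also be applied fully online at each step rather than accumulated over the trial, which is what makes the scheme compatible with synaptic plasticity and with on-chip learning.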
