Backpropagation through time and the brain

It has long been speculated that the backpropagation-of-error algorithm (backprop) may be a model of how the brain learns. Backpropagation through time (BPTT) is the canonical temporal analogue of backprop used to assign credit in recurrent neural networks in machine learning, but there is even less conviction about whether BPTT has anything to do with the brain. Even in machine learning, the use of BPTT in classic neural network architectures has proven insufficient for some challenging temporal credit assignment (TCA) problems that we know the brain is capable of solving. Nonetheless, recent work in machine learning has made progress in solving difficult TCA problems by employing novel memory-based and attention-based architectures and algorithms, some of which are brain-inspired. Importantly, these recent machine learning methods have been developed in the context of, and with reference to, BPTT, and thus serve to strengthen BPTT's position as a useful normative guide for thinking about temporal credit assignment in artificial and biological systems alike.
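
To make the mechanics of BPTT concrete, the sketch below unrolls a small vanilla RNN and propagates a final-step error backwards through every time step. The network, dimensions, and loss are illustrative assumptions chosen for brevity, not details taken from any particular work discussed here.

```python
import numpy as np

# A minimal sketch of backpropagation through time (BPTT) for a vanilla
# tanh RNN with a loss at the final step. All names, sizes, and the loss
# are illustrative assumptions, not a reference implementation.

rng = np.random.default_rng(0)
T, n_in, n_h = 5, 3, 4                          # sequence length, input size, hidden size
W_x = rng.normal(scale=0.1, size=(n_h, n_in))   # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(n_h, n_h))    # recurrent weights
xs = rng.normal(size=(T, n_in))                 # input sequence
target = rng.normal(size=n_h)                   # arbitrary target for the last state

# Forward pass: unroll the recurrence and cache every hidden state.
hs = [np.zeros(n_h)]
for t in range(T):
    hs.append(np.tanh(W_x @ xs[t] + W_h @ hs[-1]))

# Loss on the final hidden state: L = 0.5 * ||h_T - target||^2.
delta = hs[-1] - target                         # dL/dh_T

# Backward pass: walk the unrolled graph in reverse, accumulating
# weight gradients at every time step (credit assignment across time).
dW_x = np.zeros_like(W_x)
dW_h = np.zeros_like(W_h)
for t in reversed(range(T)):
    dpre = delta * (1.0 - hs[t + 1] ** 2)       # backprop through tanh
    dW_x += np.outer(dpre, xs[t])
    dW_h += np.outer(dpre, hs[t])
    delta = W_h.T @ dpre                        # send the error to the previous step

print(dW_h)  # gradient of the loss w.r.t. the recurrent weights
```

The backward loop also makes the difficulty of long-range TCA visible: the error signal is multiplied by the transposed recurrent weights at every step, so it tends to vanish or explode over long sequences. This is precisely the failure mode that the memory-based and attention-based architectures mentioned above are designed to sidestep.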
