Backpropagation and the brain

During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to determine the effect of an individual synaptic modification on the behaviour of the system. The backpropagation algorithm solves this problem in deep artificial neural networks, but historically it has been viewed as biologically problematic. Nonetheless, recent developments in neuroscience and the successes of artificial neural networks have reinvigorated interest in whether backpropagation offers insights for understanding learning in the cortex. The backpropagation algorithm learns quickly by computing synaptic updates using feedback connections to deliver error signals. Although feedback connections are ubiquitous in the cortex, it is difficult to see how they could deliver the error signals required by strict formulations of backpropagation. Here we build on past and recent developments to argue that feedback connections may instead induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.

The backpropagation of error (backprop) algorithm is frequently used to train deep neural networks in machine learning, but it has not been viewed as being implemented by the brain. In this Perspective, however, Lillicrap and colleagues argue that the key principles underlying backprop may indeed have a role in brain function.
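The contrast between explicit error signals and activity differences can be made concrete with a toy calculation. The sketch below is an illustration only and is not the formulation proposed in the Perspective. It assumes a two-layer network with a tanh hidden layer, a squared-error loss, feedback weights equal to the transpose of the forward weights, and a small nudging strength `beta`; all function and variable names are hypothetical. It computes the hidden-layer weight gradient once by standard backprop and once from the difference between the hidden activity before and after feedback nudges it.

```python
import math

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def forward(W1, W2, x):
    """Forward pass: hidden h = tanh(W1 x), output y = W2 h."""
    a = matvec(W1, x)
    h = [math.tanh(ai) for ai in a]
    y = matvec(W2, h)
    return a, h, y

def backprop_hidden_grad(W1, W2, x, t):
    """Gradient of 0.5*||y - t||^2 with respect to W1 via explicit backprop."""
    a, h, y = forward(W1, W2, x)
    e = [yi - ti for yi, ti in zip(y, t)]          # output error
    fb = matvec(transpose(W2), e)                  # error sent back through W2^T
    delta = [f * (1.0 - hi * hi) for f, hi in zip(fb, h)]
    return outer(delta, x)

def activity_difference_grad(W1, W2, x, t, beta=1e-4):
    """Estimate the same gradient from a difference in hidden activities.

    Assumption: feedback shifts the hidden input by -beta * W2^T e, and the
    learning rule only sees the activity before and after that shift.
    """
    a, h, y = forward(W1, W2, x)
    e = [yi - ti for yi, ti in zip(y, t)]
    fb = matvec(transpose(W2), e)
    h_nudged = [math.tanh(ai - beta * f) for ai, f in zip(a, fb)]
    diff = [(hi - hn) / beta for hi, hn in zip(h, h_nudged)]
    return outer(diff, x)

# Arbitrary illustrative values: 2 inputs, 3 hidden units, 1 output.
W1 = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
W2 = [[0.4, -0.7, 0.2]]
x = [1.0, -0.5]
t = [1.0]

if __name__ == "__main__":
    print(backprop_hidden_grad(W1, W2, x, t))
    print(activity_difference_grad(W1, W2, x, t))
```

For small `beta` the two estimates agree to first order, because the change in tanh activity already carries the derivative term that backprop computes explicitly. The agreement relies on the assumed symmetric feedback weights; with other feedback weights the two quantities would differ in general.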
