Predictive Coding Can Do Exact Backpropagation on Any Neural Network

The intersection of neuroscience and deep learning has benefited both fields for several decades, helping both to explain how learning works in the brain and to reach state-of-the-art performance on a range of AI benchmarks. Backpropagation (BP) is the most widely adopted method for training artificial neural networks, yet it is often criticized as biologically implausible (e.g., it lacks local update rules for the parameters). Biologically plausible learning methods based on predictive coding (a framework for describing information processing in the brain), such as inference learning (IL), are therefore increasingly studied. Recent works have proved that IL approximates BP up to a certain margin on multilayer perceptrons (MLPs), and asymptotically on any other complex model, and that zero-divergence inference learning (Z-IL), a variant of IL, implements BP exactly on MLPs. However, the literature also shows that no biologically plausible method has so far replicated the weight updates of BP exactly on complex models. To fill this gap, in this paper we generalize IL and Z-IL by defining them directly on computational graphs. To our knowledge, the result is the first biologically plausible algorithm shown to update parameters exactly as BP does on any neural network, and it is thus a breakthrough for the interdisciplinary research of neuroscience and deep learning.
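
To make the mechanism concrete, below is a minimal numerical sketch of Z-IL on a toy two-weight-layer MLP, written against the standard predictive-coding equations (value nodes, error nodes, local Hebbian-style weight updates). The layer sizes, tanh activation, learning rate, and all function names are illustrative assumptions, not taken from the paper; the schedule follows the Z-IL recipe of initializing value nodes with the feedforward pass, clamping the output to the target, and updating a layer's weights exactly when the error signal first reaches it during inference.

```python
# A minimal sketch of zero-divergence inference learning (Z-IL) on a toy MLP.
# Assumptions (not from the paper): tanh activations, the layer sizes below,
# a squared-error loss, and plain NumPy.
import numpy as np

rng = np.random.default_rng(0)

def f(x):        # activation (assumed: tanh)
    return np.tanh(x)

def f_prime(x):  # derivative of the activation
    return 1.0 - np.tanh(x) ** 2

sizes = [4, 8, 2]  # toy network: input -> hidden -> output
W = [rng.standard_normal((sizes[l + 1], sizes[l])) * 0.1
     for l in range(len(sizes) - 1)]

def zil_step(x_in, target, lr=0.01):
    """One Z-IL update; returns the per-layer weight increments dW."""
    L = len(W)
    # 1. Feedforward pass: value nodes x start at their feedforward values,
    #    so every prediction error is initially zero (zero divergence).
    x = [x_in]
    for l in range(L):
        x.append(W[l] @ f(x[l]))
    eps = [np.zeros_like(v) for v in x]
    # 2. Clamp the output node to the target: the output error appears.
    eps[L] = target - x[L]
    dW = [np.zeros_like(w) for w in W]
    # 3. Inference: at step t the error has travelled back to layer L - t,
    #    and exactly then that layer's incoming weights get their local,
    #    Hebbian-style update (presynaptic activity times postsynaptic error).
    for t in range(L):
        l = L - t
        dW[l - 1] = lr * np.outer(eps[l], f(x[l - 1]))
        if l - 1 > 0:  # relax the next error node down the hierarchy
            eps[l - 1] = f_prime(x[l - 1]) * (W[l - 1].T @ eps[l])
    return dW

# These increments equal the negative BP gradients of 0.5 * ||target - y||^2
# scaled by lr, which is the exact-equivalence claim discussed above.
x_in, target = rng.standard_normal(4), rng.standard_normal(2)
updates = zil_step(x_in, target)
```

Note that each update in the sketch uses only quantities available at the synapse itself (the presynaptic activity f(x[l-1]) and the postsynaptic error eps[l]), which is precisely the locality property that the abstract contrasts with BP; the equivalence can be checked numerically by computing the BP gradients of the same loss with any autodiff library and comparing against the lr-scaled increments.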
