Predictive Coding: a Theoretical and Experimental Review

Predictive coding offers a potentially unifying account of cortical function, postulating that the core function of the brain is to minimize prediction errors with respect to a generative model of the world. The theory is closely related to the Bayesian brain framework and, over the last two decades, has gained substantial influence in theoretical and cognitive neuroscience. A large body of research has arisen from empirically testing improved and extended theoretical and mathematical models of predictive coding, and from evaluating both their biological plausibility for implementation in the brain and the concrete neurophysiological and psychological predictions made by the theory. Despite this enduring popularity, however, no comprehensive review of predictive coding theory, and especially of recent developments in this field, exists. Here, we provide a comprehensive review of the core mathematical structure and logic of predictive coding, complementing recent tutorials in the literature (Bogacz, 2017; Buckley, Kim, McGregor, & Seth, 2017). We also review a wide range of classic and recent work within the framework, ranging from the neurobiologically realistic microcircuits that could implement predictive coding, to the close relationship between predictive coding and the widely used backpropagation of error algorithm, as well as the connections between predictive coding and modern machine learning techniques.

[1]  M. Opper,et al.  Advanced mean field methods: theory and practice , 2001 .

[2]  Colin J. Akerman,et al.  Random synaptic feedback weights support error backpropagation for deep learning , 2016, Nature Communications.

[3]  D. Knill,et al.  The Bayesian brain: the role of uncertainty in neural coding and computation , 2004, Trends in Neurosciences.

[4]  L. Riggs,et al.  The disappearance of steadily fixated visual test objects. , 1953, Journal of the Optical Society of America.

[5]  A. Seth Interoceptive inference, emotion, and the embodied self , 2013, Trends in Cognitive Sciences.

[6]  Charles Elkan,et al.  Expectation Maximization Algorithm , 2010, Encyclopedia of Machine Learning.

[7]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[8]  Maya R. Gupta,et al.  Theory and Use of the EM Algorithm , 2011, Found. Trends Signal Process..

[9]  H. B. Barlow,et al.  Possible Principles Underlying the Transformations of Sensory Messages , 2012 .

[10]  Pierre Baldi,et al.  Bayesian surprise attracts human attention , 2005, Vision Research.

[11]  A. Clark Whatever next? Predictive brains, situated agents, and the future of cognitive science. , 2013, The Behavioral and brain sciences.

[12]  David Mumford,et al.  On the computational architecture of the neocortex , 2004, Biological Cybernetics.

[13]  Stewart Shipp,et al.  Neural Elements for Predictive Coding , 2016, Front. Psychol..

[14]  R. Desimone,et al.  Competitive Mechanisms Subserve Attention in Macaque Areas V2 and V4 , 1999, The Journal of Neuroscience.

[15]  Jeff Orchard,et al.  Making Predictive Coding Networks Generative , 2019, ArXiv.

[16]  Yoshua Bengio,et al.  Generalization of Equilibrium Propagation to Vector Field Dynamics , 2018, ArXiv.

[17]  Manuel Baltieri,et al.  PID Control as a Process of Active Inference with Linear Generative Models , 2019, Entropy.

[18]  Inderjit S. Dhillon,et al.  Clustering with Bregman Divergences , 2005, J. Mach. Learn. Res..

[19]  Yoshua Bengio,et al.  Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation , 2016, Front. Comput. Neurosci..

[20]  Karl J. Friston Hierarchical Models in the Brain , 2008, PLoS Comput. Biol..

[21]  Karl J. Friston,et al.  Active interoceptive inference and the emotional brain , 2016, Philosophical Transactions of the Royal Society B: Biological Sciences.

[22]  M. Larkum,et al.  Active dendritic currents gate descending cortical outputs in perception , 2020, Nature Neuroscience.

[23]  Terrence J. Sejnowski,et al.  Unsupervised Learning , 2018, Encyclopedia of GIS.

[24]  Alexander Ororbia,et al.  Biologically Motivated Algorithms for Propagating Local Target Representations , 2018, AAAI.

[25]  Jacques Kaiser,et al.  Synaptic Plasticity Dynamics for Deep Continuous Local Learning (DECOLLE) , 2018, Frontiers in Neuroscience.

[26]  H J Gerrits,et al.  Artificial movements of a stabilized image. , 1970, Vision research.

[27]  Rajesh P. N. Rao,et al.  Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. , 1999 .

[28]  R. Stengel Stochastic Optimal Control: Theory and Application , 1986 .

[29]  Kevan A. C. Martin,et al.  Whose Cortical Column Would that Be? , 2010, Front. Neuroanat..

[30]  Ryota Kanai,et al.  Deep learning and the Global Workspace Theory , 2020, Trends in Neurosciences.

[31]  Demis Hassabis,et al.  Mastering the game of Go without human knowledge , 2017, Nature.

[32]  Ilya Sutskever,et al.  Language Models are Unsupervised Multitask Learners , 2019 .

[33]  Karl J. Friston The free-energy principle: a unified brain theory? , 2010, Nature Reviews Neuroscience.

[34]  Rafal Bogacz,et al.  A tutorial on the free-energy framework for modelling perception and learning , 2017, Journal of mathematical psychology.

[35]  Rafal Bogacz,et al.  An Approximation of the Error Backpropagation Algorithm in a Predictive Coding Network with Local Hebbian Synaptic Plasticity , 2017, Neural Computation.

[36]  Mark Chen,et al.  Language Models are Few-Shot Learners , 2020, NeurIPS.

[37]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[38]  Jeroen J. A. van Boxtel,et al.  A predictive coding perspective on autism spectrum disorders , 2013, Front. Psychology.

[39]  Magdy Bayoumi,et al.  Reduced-Gate Convolutional LSTM Architecture for Next-Frame Video Prediction Using Predictive Coding , 2019, 2019 International Joint Conference on Neural Networks (IJCNN).

[40]  Beren Millidge Implementing Predictive Processing and Active Inference: Preliminary Steps and Results , 2019 .

[41]  S. Laughlin,et al.  Predictive coding: a fresh view of inhibition in the retina , 1982, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[42]  Niraj S. Desai,et al.  Homeostatic Plasticity and STDP: Keeping a Neuron's Cool in a Fluctuating World , 2010, Front. Syn. Neurosci..

[43]  Matthew H Tong,et al.  SUN: Top-down saliency using natural statistics , 2009, Visual cognition.

[44]  Friedemann Zenke,et al.  The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks , 2020, bioRxiv.

[45]  Gabriel Kreiman,et al.  Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning , 2016, ICLR.

[46]  H. Critchley,et al.  Extending predictive processing to the body: emotion as interoceptive inference. , 2013, The Behavioral and brain sciences.

[47]  Heiga Zen,et al.  WaveNet: A Generative Model for Raw Audio , 2016, SSW.

[48]  Georg B. Keller,et al.  Predictive Processing: A Canonical Cortical Computation , 2018, Neuron.

[49]  M. Frank,et al.  Computational psychiatry as a bridge from neuroscience to clinical applications , 2016, Nature Neuroscience.

[50]  Michael W. Spratling Reconciling Predictive Coding and Biased Competition Models of Cortical Function , 2008, Frontiers Comput. Neurosci..

[51]  Matin Hosseini,et al.  Hierarchical Predictive Coding Models in a Deep-Learning Framework , 2020, ArXiv.

[52]  Murray Shanahan,et al.  A predictive processing model of episodic memory and time perception , 2020, bioRxiv.

[53]  Wieland Brendel,et al.  Learning to represent signals spike by spike , 2017, PLoS Comput. Biol..

[54]  R. Prager,et al.  Experiments with Simple Hebbian-based Learning Rules in Pattern Classification Tasks , 2007 .

[55]  Wasserman,et al.  Bayesian Model Selection and Model Averaging. , 2000, Journal of mathematical psychology.

[56]  Anil K. Seth,et al.  Reinforcement Learning through Active Inference , 2020, ArXiv.

[57]  J. Henderson Gaze Control as Prediction , 2017, Trends in Cognitive Sciences.

[58]  R. Feynman Statistical Mechanics, A Set of Lectures , 1972 .

[59]  James C. R. Whittington,et al.  Theories of Error Back-Propagation in the Brain , 2019, Trends in Cognitive Sciences.

[60]  Kathleen S. Rockland,et al.  What do we know about laminar connectivity? , 2017, NeuroImage.

[61]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[62]  Beren Millidge,et al.  Investigating the Scalability and Biological Plausibility of the Activation Relaxation Algorithm , 2020, ArXiv.

[63]  Tom Minka,et al.  A family of algorithms for approximate Bayesian inference , 2001 .

[64]  Eric Nalisnick,et al.  Normalizing Flows for Probabilistic Modeling and Inference , 2019, J. Mach. Learn. Res..

[65]  W. Singer,et al.  Neural Synchrony in Brain Disorders: Relevance for Cognitive Dysfunctions and Pathophysiology , 2006, Neuron.

[66]  Derrick J. Parkhurst,et al.  Modeling the role of salience in the allocation of overt visual attention , 2002, Vision Research.

[67]  Manuel Baltieri A Bayesian perspective on classical control , 2020, 2020 International Joint Conference on Neural Networks (IJCNN).

[68]  Paul J. Werbos,et al.  Applications of advances in nonlinear sensitivity analysis , 1982 .

[69]  W. Ashby,et al.  Every Good Regulator of a System Must Be a Model of That System , 1970 .

[70]  Karl J. Friston,et al.  Active Inference: A Process Theory , 2017, Neural Computation.

[71]  Christopher L. Buckley,et al.  A Probabilistic Interpretation of PID Controllers Using Active Inference , 2018, SAB.

[72]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.

[73]  Thomas Lukasiewicz,et al.  Can the Brain Do Backpropagation? - Exact Implementation of Backpropagation in Predictive Coding Networks , 2020, NeurIPS.

[74]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[75]  Yoshua Bengio,et al.  Early Inference in Energy-Based Models Approximates Back-Propagation , 2015, ArXiv.

[76]  Lars Muckli,et al.  The Predictive Coding Account of Psychosis , 2018, Biological Psychiatry.

[77]  Karl J. Friston,et al.  Predictions not commands: active inference in the motor system , 2012, Brain Structure and Function.

[78]  H. Critchley,et al.  An Interoceptive Predictive Coding Model of Conscious Presence , 2011, Front. Psychology.

[79]  Philipp Sterzer,et al.  A predictive coding account of bistable perception - a model-based fMRI study , 2017, PLoS Comput. Biol..

[80]  Yann Ollivier,et al.  Unbiased Online Recurrent Optimization , 2017, ICLR.

[81]  Peter C. Humphreys,et al.  Deep Learning without Weight Transport , 2019, NeurIPS.

[82]  Karl J. Friston,et al.  Predictive coding explains binocular rivalry: An epistemological review , 2008, Cognition.

[83]  D Purves,et al.  The extraordinarily rapid disappearance of entoptic images. , 1996, Proceedings of the National Academy of Sciences of the United States of America.

[84]  R. Weale Vision. A Computational Investigation Into the Human Representation and Processing of Visual Information. David Marr , 1983 .

[85]  Sang Joon Kim,et al.  A Mathematical Theory of Communication , 2006 .

[86]  Karl J. Friston,et al.  Reinforcement Learning or Active Inference? , 2009, PloS one.

[87]  Nikola T. Markov,et al.  Anatomy of hierarchy: Feedforward and feedback pathways in macaque visual cortex , 2013, The Journal of comparative neurology.

[88]  F. H. Adler Cybernetics, or Control and Communication in the Animal and the Machine. , 1949 .

[89]  Karl J. Friston,et al.  Repetition suppression and its contextual determinants in predictive coding , 2016, Cortex.

[90]  G. Pezzulo,et al.  Simulating homeostatic, allostatic and goal-directed forms of interoceptive control using active inference , 2021, Biological Psychology.

[91]  Michael W. Spratling A review of predictive coding algorithms , 2017, Brain and Cognition.

[92]  Joseph Marino,et al.  Predictive Coding, Variational Autoencoders, and Biological Connections , 2019, Neural Computation.

[93]  Beren Millidge,et al.  Predictive Coding Approximates Backprop Along Arbitrary Computation Graphs , 2020, Neural Computation.

[94]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[95]  P. Kok On the role of expectation in visual perception: A top-down view of early visual cortex , 2015 .

[96]  A. Kitaoka,et al.  Illusory Motion Reproduced by Deep Neural Networks Trained for Prediction , 2018, Front. Psychol..

[97]  Yoshua Bengio,et al.  Dendritic cortical microcircuits approximate the backpropagation algorithm , 2018, NeurIPS.

[98]  Wolfgang Maass,et al.  A solution to the learning dilemma for recurrent networks of spiking neurons , 2019, Nature Communications.

[99]  Andrzej Cichocki,et al.  Families of Alpha- Beta- and Gamma- Divergences: Flexible and Robust Measures of Similarities , 2010, Entropy.

[100]  Beren Millidge Combining Active Inference and Hierarchical Predictive Coding: A Tutorial Introduction and Case Study , 2019 .

[101]  Nikolaus Kriegeskorte,et al.  Recurrence is required to capture the representational dynamics of the human visual system , 2019, Proceedings of the National Academy of Sciences.

[102]  Andrew Gelman,et al.  Handbook of Markov Chain Monte Carlo , 2011 .

[103]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[104]  Michael W. Spratling Predictive Coding as a Model of Response Properties in Cortical Area V1 , 2010, The Journal of Neuroscience.

[105]  Christian K. Machens,et al.  Predictive Coding of Dynamical Variables in Balanced Spiking Networks , 2013, PLoS Comput. Biol..

[106]  Karl J. Friston,et al.  Neuronal message passing using Mean-field, Bethe, and Marginal approximations , 2019, Scientific Reports.

[107]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[108]  D. J. Felleman,et al.  Distributed hierarchical processing in the primate cerebral cortex. , 1991, Cerebral cortex.

[109]  Shuang-Nan Zhang On the Solution to the , 2010 .

[110]  Guillaume Bouchard,et al.  The Tradeoff Between Generative and Discriminative Classifiers , 2004 .

[111]  Adam Santoro,et al.  Backpropagation and the brain , 2020, Nature Reviews Neuroscience.

[112]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[113]  M. Chait,et al.  Great expectations: Is there evidence for predictive coding in auditory cortex? , 2017, Neuroscience.

[114]  Christopher L. Buckley,et al.  On Kalman-Bucy filters, linear quadratic control and active inference , 2020, arXiv:2005.06269.

[115]  Hassana K. Oyibo,et al.  Experience-dependent spatial expectations in mouse visual cortex , 2016, Nature Neuroscience.

[116]  David P. McGovern,et al.  Evaluating the neurophysiological evidence for predictive processing as a model of perception , 2020, Annals of the New York Academy of Sciences.

[117]  The Philosophy and Science of Predictive Processing , 2020 .

[118]  Florian Nadel,et al.  Stochastic Processes And Filtering Theory , 2016 .

[119]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[120]  Geoffrey E. Hinton,et al.  The Helmholtz Machine , 1995, Neural Computation.

[121]  Sophie Denève,et al.  Bayesian Inference with Spiking Neurons , 2004, Encyclopedia of Computational Neuroscience.

[122]  Geoffrey E. Hinton,et al.  Autoencoders, Minimum Description Length and Helmholtz Free Energy , 1993, NIPS.

[123]  Karl J. Friston,et al.  Action and behavior: a free-energy formulation , 2010, Biological Cybernetics.

[124]  Karl J. Friston,et al.  Cerebral hierarchies: predictive processing, precision and the pulvinar , 2015, Philosophical Transactions of the Royal Society B: Biological Sciences.

[125]  Beren Millidge,et al.  Relaxing the Constraints on Predictive Coding Models , 2020, ArXiv.

[126]  Kelvin E. Jones,et al.  Neuronal variability: noise or part of the signal? , 2005, Nature Reviews Neuroscience.

[127]  Lydia Ng,et al.  The organization of intracortical connections by layer and cell class in the mouse brain , 2018, bioRxiv.

[128]  Oriol Vinyals,et al.  Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.

[129]  Karl J. Friston,et al.  A theory of cortical responses , 2005, Philosophical Transactions of the Royal Society B: Biological Sciences.

[130]  Rewon Child Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images , 2021, ICLR.

[131]  Ilya Sutskever,et al.  Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.

[132]  Floris P. de Lange,et al.  Predictive Coding in Sensory Cortex , 2015 .

[133]  R. Cao New Labels for Old Ideas: Predictive Processing and the Interpretation of Neural Signals , 2020, Review of Philosophy and Psychology.

[134]  Alex M. Thomson,et al.  Neocortical Layer 6, A Review , 2010, Front. Neuroanat..

[135]  Karl J. Friston Learning and inference in the brain , 2003, Neural Networks.

[136]  Simon McGregor,et al.  The free energy principle for action and perception: A mathematical review , 2017, arXiv:1705.09156.

[137]  Hesham Mostafa,et al.  Surrogate Gradient Learning in Spiking Neural Networks: Bringing the Power of Gradient-based optimization to spiking neural networks , 2019, IEEE Signal Processing Magazine.

[138]  Ha Hong,et al.  Performance-optimized hierarchical models predict neural responses in higher visual cortex , 2014, Proceedings of the National Academy of Sciences.

[139]  Geoffrey E. Hinton,et al.  A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[140]  Thomas Lukasiewicz,et al.  Predictive Coding Can Do Exact Backpropagation on Convolutional and Recurrent Neural Networks , 2021, ArXiv.

[141]  Karl J. Friston,et al.  A free energy principle for the brain , 2006, Journal of Physiology-Paris.

[142]  Karl J. Friston,et al.  DEM: A variational treatment of dynamic systems , 2008, NeuroImage.

[143]  Karl J. Friston,et al.  Hierarchical Active Inference: A Theory of Motivated Control , 2018, Trends in Cognitive Sciences.

[144]  R. Desimone Visual attention mediated by biased competition in extrastriate visual cortex. , 1998, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[145]  Yoshua Bengio,et al.  STDP-Compatible Approximation of Backpropagation in an Energy-Based Model , 2017, Neural Computation.

[146]  Karl J. Friston,et al.  Active Inference, epistemic value, and vicarious trial and error , 2016, Learning & memory.

[147]  Naoki Kogo,et al.  Is predictive coding theory articulated enough to be testable? , 2015, Front. Comput. Neurosci..

[148]  Karl J. Friston,et al.  An Aberrant Precision Account of Autism , 2022 .

[149]  Demis Hassabis,et al.  Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.

[150]  Surya Ganguli,et al.  SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks , 2017, Neural Computation.

[151]  G. Kanizsa Margini Quasi-percettivi in Campi con Stimolazione Omogenea , 1955 .

[152]  Karl J. Friston,et al.  Reflections on agranular architecture: predictive coding in the motor cortex , 2013, Trends in Neurosciences.

[153]  Matthew J. Beal Variational algorithms for approximate Bayesian inference , 2003 .

[154]  Francis Crick,et al.  The recent excitement about neural networks , 1989, Nature.

[155]  Ronald J. Williams,et al.  Experimental Analysis of the Real-time Recurrent Learning Algorithm , 1989 .

[156]  Beren Millidge,et al.  Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain , 2020, ArXiv.

[157]  R. A. Boyles On the Convergence of the EM Algorithm , 1983 .

[158]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.

[159]  W. K. Simmons,et al.  Interoceptive predictions in the brain , 2015, Nature Reviews Neuroscience.

[160]  T. Moore,et al.  Neural Mechanisms of Selective Visual Attention. , 2017, Annual review of psychology.

[161]  Karl J. Friston,et al.  Canonical Microcircuits for Predictive Coding , 2012, Neuron.

[162]  Yoshua Bengio,et al.  Difference Target Propagation , 2014, ECML/PKDD.

[163]  T. Başar,et al.  A New Approach to Linear Filtering and Prediction Problems , 2001 .

[164]  Yali Amit,et al.  Deep Learning With Asymmetric Connections and Hebbian Updates , 2018, Front. Comput. Neurosci..

[165]  M. Nour Surfing Uncertainty: Prediction, Action, and the Embodied Mind. , 2017, British Journal of Psychiatry.

[166]  Shun-ichi Amari,et al.  Information geometry of the EM and em algorithms for neural networks , 1995, Neural Networks.

[167]  U. Alon An introduction to systems biology : design principles of biological circuits , 2019 .

[168]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[169]  Eugenio Culurciello,et al.  Deep Predictive Coding Network for Object Recognition , 2018, ICML.

[170]  M. Carandini,et al.  Normalization as a canonical neural computation , 2011, Nature Reviews Neuroscience.

[171]  G. Buzsáki Rhythms of the brain , 2006 .

[172]  Daniel Williams,et al.  Predictive Processing and the Representation Wars , 2017, Minds and Machines.

[173]  E. Schuman,et al.  Role for a cortical input to hippocampal area CA1 in the consolidation of a long-term memory , 2004, Nature.

[174]  H. Kennedy,et al.  Visual Areas Exert Feedforward and Feedback Influences through Distinct Frequency Channels , 2014, Neuron.

[175]  David M. Blei,et al.  Variational Inference: A Review for Statisticians , 2016, ArXiv.

[176]  G. Turrigiano Homeostatic plasticity in neuronal networks: the more things change, the more they stay the same , 1999, Trends in Neurosciences.

[177]  Karl J. Friston,et al.  Active inference and epistemic value , 2015, Cognitive neuroscience.

[178]  K. Grill-Spector,et al.  The human visual cortex. , 2004, Annual review of neuroscience.