Learning with reward prediction errors in a model of the Drosophila mushroom body

Effective decision making in a changing environment demands that accurate predictions are learned about decision outcomes. In Drosophila, such learning is orchestrated in part by the mushroom body (MB), where dopamine neurons (DANs) signal reinforcing stimuli to modulate plasticity presynaptic to MB output neurons (MBONs). Here, we extend previous MB models, in which DANs signal absolute rewards, proposing instead that DANs signal reward prediction errors (RPEs) by utilising feedback reward predictions from MBONs. We formulate plasticity rules that minimise RPEs, and use simulations to verify that MBONs learn accurate reward predictions. We postulate as yet unobserved connectivity, which not only overcomes limitations in the experimentally constrained model, but also explains additional experimental observations that connect MB physiology to learning. The original, experimentally constrained model and the augmented model capture a broad range of established fly behaviours, and together make five predictions that can be tested using established experimental methods.
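
To illustrate the general idea of plasticity driven by a reward prediction error, the following is a minimal sketch of a generic delta rule (Rescorla-Wagner style). It is not the plasticity rule formulated in this work. All names (`train_rpe`, `kc_patterns`, `rewards`, `lr`) and the toy odour patterns are assumptions made for illustration: a single readout stands in for the MBON reward prediction, and the error between delivered and predicted reward stands in for the DAN signal.

```python
import numpy as np

def train_rpe(kc_patterns, rewards, n_trials=300, lr=0.1):
    """Generic delta-rule learner (illustrative assumption, not the paper's model).

    kc_patterns: (n_cues, n_kc) array of sensory activity for each cue.
    rewards:     (n_cues,) array of reward delivered with each cue.
    Returns the learned weight vector of the single prediction readout.
    """
    n_cues, n_kc = kc_patterns.shape
    w = np.zeros(n_kc)
    for t in range(n_trials):
        i = t % n_cues               # present cues in a fixed cycle
        x = kc_patterns[i]
        prediction = w @ x           # readout: predicted reward for this cue
        rpe = rewards[i] - prediction  # reward prediction error
        w += lr * rpe * x            # update only weights of active inputs
    return w

# Toy example: three cues, each activating two of six input units.
kc_patterns = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)
rewards = np.array([1.0, -0.5, 0.0])

w = train_rpe(kc_patterns, rewards)
predictions = kc_patterns @ w
print(np.round(predictions, 3))
```

Because the update is proportional to the error, learning slows as predictions approach the delivered reward, and the error for each cue shrinks towards zero in this toy setting.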
