Multiple Dopamine Systems: Weal and Woe of Dopamine.

The ability to predict future outcomes increases an animal's fitness. Decades of research have shown that dopamine neurons broadcast reward prediction error (RPE) signals, the discrepancy between actual and predicted reward, to drive learning to predict future outcomes. Recent studies have begun to show, however, that dopamine neurons are more diverse than previously thought. In this review, we summarize a series of our studies showing that dopamine neurons projecting to the posterior "tail" of the striatum (TS) have unique properties in terms of anatomy, activity, and function. Specifically, TS-projecting dopamine neurons are activated by a subset of negative events, including threats from a novel object; signal prediction errors for external threats; and reinforce avoidance behaviors. These results indicate that there are at least two axes of dopamine-mediated reinforcement learning in the brain: one learning from canonical RPEs and another learning from threat prediction errors. We argue that the existence of multiple learning systems is an adaptive strategy that allows each system to be optimized for its own needs. The compartmental organization of the mammalian striatum resembles that of a dopamine-recipient area in insects (the mushroom body), pointing to a principle of dopamine function conserved across phyla.
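As a purely illustrative sketch, not the model used in the studies reviewed here, the prediction error described above can be written as the difference between the actual and the predicted outcome, with the prediction nudged toward the outcome on each trial (a simple delta rule). The function names, learning rates, and outcome values below are assumptions chosen for illustration; the two calls to `learn` stand in for two separate learners, one driven by reward and one by threat.

```python
def prediction_error(actual, predicted):
    """Prediction error: actual outcome minus predicted outcome."""
    return actual - predicted


def learn(outcomes, learning_rate=0.1, initial=0.0):
    """Run a delta-rule learner over a sequence of outcomes.

    Returns the final prediction and the per-trial prediction errors.
    The learning rate and initial prediction are illustrative assumptions.
    """
    predicted = initial
    errors = []
    for actual in outcomes:
        error = prediction_error(actual, predicted)
        errors.append(error)
        # Move the prediction a fraction of the way toward the outcome.
        predicted += learning_rate * error
    return predicted, errors


# Two independent learners, each with its own (hypothetical) learning rate:
# one tracks reward magnitude, the other tracks threat intensity.
reward_value, reward_errors = learn([1.0] * 50, learning_rate=0.1)
threat_value, threat_errors = learn([1.0] * 50, learning_rate=0.3)
```

In this sketch each learner's error shrinks as its prediction approaches the outcome, and keeping the two learners separate lets each use its own parameters, which loosely mirrors the argument that separate systems can each be tuned to their own needs.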
