Phasic Dopamine Release in the Rat Nucleus Accumbens Symmetrically Encodes a Reward Prediction Error Term

Making predictions about the rewards associated with environmental stimuli and updating those predictions through feedback is an essential aspect of adaptive behavior. Theorists have argued that dopamine encodes a reward prediction error (RPE) signal that is used in such a reinforcement learning process. Recent fMRI work has demonstrated that the BOLD signal in dopaminergic target areas satisfies the necessary and sufficient conditions of an axiomatic model of the RPE hypothesis. However, there has been no direct evidence that dopamine release itself also satisfies these necessary and sufficient criteria for encoding an RPE signal. Further, because dopamine neurons have low tonic firing rates, they have a limited dynamic range for encoding negative RPEs, which has led to considerable debate about whether positive and negative prediction errors are encoded on a similar scale. To address both of these issues, we used fast-scan cyclic voltammetry to measure reward-evoked dopamine release at carbon-fiber electrodes chronically implanted in the nucleus accumbens core of rats trained on a probabilistic decision-making task. We demonstrate that dopamine concentrations transmit a bidirectional RPE signal with symmetrical encoding of positive and negative RPEs. Our findings strengthen the case that changes in dopamine concentration alone are sufficient to encode the full range of RPEs necessary for reinforcement learning.
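For readers less familiar with the RPE formalism, the sketch below illustrates how a trial-by-trial prediction error can be computed under a simple Rescorla-Wagner style learning rule, and why unexpected rewards and unexpected omissions produce positive and negative errors of comparable magnitude once the prediction stabilizes. This is an illustrative model only, not the analysis used in the paper; the function name run_trials, the learning rate alpha, and the binary reward values are assumptions made for the example.

```python
# Minimal sketch of a trial-by-trial reward prediction error (RPE) computation
# for a probabilistic reward task. Illustrative Rescorla-Wagner style model,
# not the paper's analysis code; alpha and the reward values are assumed.

def run_trials(rewards, alpha=0.1, v0=0.0):
    """Return the RPE on each trial and the final value estimate.

    rewards : sequence of received reward magnitudes (e.g., 0 or 1)
    alpha   : learning rate (assumed)
    v0      : initial reward prediction
    """
    v = v0
    rpes = []
    for r in rewards:
        delta = r - v          # RPE: outcome minus prediction
        rpes.append(delta)     # positive when r > v, negative when r < v
        v += alpha * delta     # update the prediction toward the outcome
    return rpes, v

if __name__ == "__main__":
    # Example: reward delivered on half the trials. Once the prediction
    # settles near 0.5, deliveries yield positive RPEs and omissions yield
    # negative RPEs of roughly equal size (about +0.5 and -0.5).
    rewards = [1, 0, 1, 1, 0, 0, 1, 0]
    rpes, v = run_trials(rewards)
    print([round(d, 3) for d in rpes], round(v, 3))
```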
