The cost of obtaining rewards enhances the reward prediction error signal of midbrain dopamine neurons

Midbrain dopamine neurons are known to encode reward prediction errors (RPEs) that are used to update value predictions. Here, we examine whether the RPE signals coded by midbrain dopamine neurons are modulated by the cost paid to obtain rewards, by recording from dopamine neurons in awake behaving monkeys performing an effortful saccade task. Dopamine neuron responses to reward-predicting cues and to reward delivery were larger after a costly action than after a less costly action, suggesting that RPEs are enhanced following the performance of a costly action. At the behavioral level, stimulus-reward associations were learned faster after a costly action than after a less costly action. Thus, information about action cost is processed in the dopamine reward system in a manner that amplifies the subsequent dopamine RPE signal, which in turn promotes more rapid learning in high-cost situations.

Rewards that require high effort tend to be preferred over those that require low effort. Here, the authors show how the effort of obtaining rewards affects the reward-related activity of dopamine neurons and, in turn, the speed of learning stimulus-reward associations.
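
The link between an amplified RPE and faster learning can be illustrated with a minimal sketch. It assumes a standard Rescorla-Wagner style value update, which is not necessarily the model used by the authors. The multiplicative `rpe_gain` factor, the learning rate `alpha`, and the 90% learning criterion are hypothetical choices made only for illustration.

```python
def trials_to_criterion(rpe_gain, alpha=0.1, reward=1.0,
                        criterion=0.9, max_trials=1000):
    """Count trials until the learned value reaches criterion * reward.

    rpe_gain is a hypothetical multiplier on the prediction error,
    standing in for the cost-dependent enhancement of the RPE signal.
    """
    value = 0.0  # initial value prediction for the stimulus
    for trial in range(1, max_trials + 1):
        rpe = reward - value             # reward prediction error
        value += alpha * rpe_gain * rpe  # value update scaled by the gain
        if value >= criterion * reward:
            return trial
    return max_trials


# Assumed gains: 1.0 after a low-cost action, 1.5 after a high-cost action.
low_cost_trials = trials_to_criterion(rpe_gain=1.0)
high_cost_trials = trials_to_criterion(rpe_gain=1.5)
print(low_cost_trials, high_cost_trials)
```

Under these assumptions, the larger gain reaches the criterion in fewer trials, mirroring qualitatively the faster learning reported after costly actions.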
