Dopamine prediction error responses integrate subjective value from different reward dimensions

Significance Most real-world rewards have multiple dimensions, such as amount, risk, and type. Here we show that, within a bounded set of such multidimensional rewards, monkeys’ choice behavior fulfilled several core tenets of rational choice theory; namely, their choices were stochastically complete and transitive. As such, in selecting between rewards, the monkeys behaved as if they maximized value on a common scale. Dopamine neurons encoded prediction errors that reflected that scale. A particular reward dimension influenced dopamine activity only to the extent that it influenced choice. Thus, vastly different reward types, such as juice and food, activated dopamine neurons in accordance with the subjective value derived from those rewards. This neuronal signal could serve to update value signals for economic choice.

Abstract Prediction error signals enable us to learn through experience. These experiences include economic choices between different rewards that vary along multiple dimensions. Therefore, an ideal way to reinforce economic choice is to encode a prediction error that reflects the subjective value integrated across these reward dimensions. Previous studies demonstrated that dopamine prediction error responses reflect the value of singular reward attributes, including magnitude, probability, and delay. Obviously, preferences between rewards that vary along one dimension are completely determined by the manipulated variable. However, it is unknown whether dopamine prediction error responses reflect subjective value integrated from different reward dimensions. Here, we measured preferences between rewards that varied along multiple dimensions and thus could not be ranked according to objective metrics. Monkeys chose between rewards that differed in amount, risk, and type. Because their choices were complete and transitive, the monkeys chose “as if” they integrated the different rewards and attributes into a common scale of value. The prediction error responses of single dopamine neurons reflected the integrated subjective value inferred from the choices, rather than the singular reward attributes. Specifically, amount, risk, and reward type modulated dopamine responses exactly to the extent that they influenced economic choices, even when rewards were vastly different, such as liquid and food. This prediction error response could provide a direct updating signal for economic values.
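The account above maps onto a simple computational scheme: integrate each option’s attributes into one common-scale value, choose stochastically between those values, and let the prediction error on that same scale drive learning. Below is a minimal sketch of that scheme, assuming a linear utility over amount and risk, a logistic (softmax) choice rule, and a Rescorla–Wagner-style value update; the weights, the type-specific offsets, and all function names are illustrative assumptions, not the model fitted in the study.

```python
import math

# Hypothetical utility parameters; in the study, each animal's parameters
# would be inferred from its own choices.
W_AMOUNT, W_RISK = 1.0, 0.3
TYPE_OFFSET = {"juice": 0.0, "food": -0.2}  # assumed type-specific offsets

def subjective_value(amount, risk, reward_type):
    """Integrate amount, risk, and reward type into one common-scale value."""
    return W_AMOUNT * amount + W_RISK * risk + TYPE_OFFSET[reward_type]

def choice_prob(value_a, value_b, temperature=1.0):
    """Logistic (softmax) probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-(value_a - value_b) / temperature))

def prediction_error(received_value, predicted_value):
    """Dopamine-like reward prediction error on the common value scale."""
    return received_value - predicted_value

def update_value(predicted_value, delta, learning_rate=0.1):
    """Rescorla-Wagner-style update of the cached economic value."""
    return predicted_value + learning_rate * delta

# Example: a risky juice option versus a safe food option.
v_juice = subjective_value(amount=0.4, risk=0.5, reward_type="juice")
v_food = subjective_value(amount=0.6, risk=0.0, reward_type="food")
p_choose_juice = choice_prob(v_juice, v_food)

# After the outcome, the prediction error reflects the integrated value,
# not any single attribute, and can update the stored value estimate.
delta = prediction_error(received_value=v_juice, predicted_value=0.3)
new_prediction = update_value(0.3, delta)
```

In this sketch, an attribute such as risk or reward type changes the prediction error only insofar as it changes subjective_value, mirroring the finding that reward dimensions modulated dopamine responses only to the extent that they influenced choice.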
