Dopamine signals for reward value and risk: basic and recent data

Background: Previous lesion, electrical self-stimulation and drug addiction studies suggest that the midbrain dopamine systems are part of the reward system of the brain. This review provides an updated overview of the basic signals of dopamine neurons in response to environmental stimuli.

Methods: The described experiments used standard behavioral and neurophysiological methods to record the activity of single dopamine neurons in awake monkeys during specific behavioral tasks.

Results: Dopamine neurons show phasic activations to external stimuli. The signal reflects reward, physical salience, risk and punishment, in descending order of fractions of responding neurons. Expected reward value is a key decision variable for economic choices. The reward response codes reward value, probability and their summed product, expected value. The neurons code reward value as it differs from prediction, thus fulfilling the basic requirement for a bidirectional prediction error teaching signal postulated by learning theory. This response is scaled in units of standard deviation. By contrast, relatively few dopamine neurons show phasic activation following punishers and conditioned aversive stimuli, suggesting a lack of relationship of the reward response to general attention and arousal. Large proportions of dopamine neurons are also activated by intense, physically salient stimuli. This response is enhanced when the stimuli are novel; it appears to be distinct from the reward value signal. Dopamine neurons also show unspecific activations to non-rewarding stimuli that are possibly due to generalization by similar stimuli and pseudoconditioning by primary rewards. These activations are shorter than reward responses and are often followed by depression of activity. A separate, slower dopamine signal informs about risk, another important decision variable. The prediction error response occurs only with reward; it is scaled by the risk of predicted reward.

Conclusions: Neurophysiological studies reveal phasic dopamine signals that transmit information related predominantly but not exclusively to reward. Although not entirely homogeneous, the dopamine signal is more restricted and stereotyped than neuronal activity in most other brain structures involved in goal-directed behavior.
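
The quantities named in the Results (expected value, bidirectional prediction error, and scaling by standard deviation) can be illustrated with a minimal sketch. It is not the analysis used in the reviewed experiments: the function names, the example gamble, and the choice to normalize by dividing the error by the standard deviation of the predicted reward distribution are all assumptions made here for illustration.

```python
import math


def expected_value(outcomes):
    """Sum of probability * magnitude over (probability, magnitude) pairs."""
    return sum(p * m for p, m in outcomes)


def risk_sd(outcomes):
    """Standard deviation of the reward distribution (assumed risk measure)."""
    ev = expected_value(outcomes)
    return math.sqrt(sum(p * (m - ev) ** 2 for p, m in outcomes))


def prediction_error(received, outcomes):
    """Bidirectional error: positive when the reward exceeds the prediction,
    negative when it falls short, zero when it matches."""
    return received - expected_value(outcomes)


def scaled_prediction_error(received, outcomes):
    """Prediction error in units of standard deviation (assumed normalization)."""
    sd = risk_sd(outcomes)
    if sd == 0:
        raise ValueError("a fully predicted reward has zero risk; scaling is undefined")
    return prediction_error(received, outcomes) / sd


# Hypothetical gamble: 0.4 ml of juice with probability 0.5, otherwise nothing.
gamble = [(0.5, 0.4), (0.5, 0.0)]

ev = expected_value(gamble)                           # predicted reward
sd = risk_sd(gamble)                                  # risk of that prediction
error_win = prediction_error(0.4, gamble)             # better than predicted
error_loss = prediction_error(0.0, gamble)            # worse than predicted
scaled_win = scaled_prediction_error(0.4, gamble)     # error in SD units
```

In this hypothetical gamble the two possible outcomes produce errors of equal size and opposite sign, and dividing by the standard deviation expresses each as one unit of risk regardless of the absolute reward magnitude.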
