Correlated Coding of Motivation and Outcome of Decision by Dopamine Neurons

We recorded the activity of midbrain dopamine neurons in an instrumental conditioning task in which monkeys made a series of behavioral decisions on the basis of distinct reward expectations. Dopamine neurons responded to the first visual cue appearing in each trial [the conditioned stimulus (CS)], through which monkeys initiated the trial for decision while expecting the trial-specific reward probability and volume. The magnitude of neuronal responses to the CS was approximately proportional to reward expectation, but with considerable deviations. These CS responses appeared instead to represent motivational properties, because their magnitude on trials with identical reward expectation was significantly negatively correlated with the animal's reaction times after the CS. Dopamine neurons also responded to reinforcers that occurred after behavioral decisions, and these responses precisely encoded positive and negative reward expectation errors (REEs). The gain with which spike frequency encoded REEs increased as the animals learned act-outcome contingencies over a few months of task training, whereas the coding of motivational properties remained consistent throughout learning. We found that the magnitude of CS responses was positively correlated with that of the responses to reinforcers, suggesting that motivation modulates the effectiveness of REEs as a teaching signal. For instance, learning could proceed faster when animals are highly motivated and more slowly when they are less motivated, even at identical REEs. Therefore, this dual, correlated coding of motivation and REEs suggests that the dopamine system is involved both in reinforcement, in more elaborate ways than currently proposed, and in motivational functions during reward-based decision-making and learning.
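The proposed modulation of REEs by motivation can be made concrete with a minimal sketch. This is not the paper's model: it assumes the REE behaves like a Rescorla-Wagner-style prediction error and that a motivation factor scales the effective learning rate; all names and values are illustrative.

```python
# Illustrative sketch (assumption, not from the paper): a Rescorla-Wagner-style
# update in which a motivation factor gates the impact of the reward
# expectation error (REE), mirroring the suggestion that CS-response
# magnitude (motivation) modulates the REE's effectiveness as a teaching signal.

def ree(reward: float, expectation: float) -> float:
    """Reward expectation error: positive when the outcome exceeds expectation."""
    return reward - expectation

def update_expectation(expectation: float, reward: float,
                       motivation: float, alpha: float = 0.1) -> float:
    """One learning step; `motivation` in (0, 1] scales the REE's impact."""
    return expectation + alpha * motivation * ree(reward, expectation)

# Identical REE sequence, different motivation: learning speed differs.
v_hi, v_lo = 0.0, 0.0
for _ in range(20):
    v_hi = update_expectation(v_hi, reward=1.0, motivation=1.0)
    v_lo = update_expectation(v_lo, reward=1.0, motivation=0.3)
print(v_hi > v_lo)  # prints True: the highly motivated learner converges faster
```

Under this toy assumption, two animals receiving identical REEs converge at different rates because motivation multiplies the teaching signal, which is the qualitative pattern the abstract describes.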
