Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops

Evaluation of both immediate and future outcomes of one's actions is a critical requirement for intelligent behavior. Using functional magnetic resonance imaging (fMRI), we investigated brain mechanisms for reward prediction at different time scales in a Markov decision task. When human subjects learned actions on the basis of immediate rewards, significant activity was seen in the lateral orbitofrontal cortex and the striatum. When subjects learned to act in order to obtain large future rewards while incurring small immediate losses, the dorsolateral prefrontal cortex, inferior parietal cortex, dorsal raphe nucleus and cerebellum were also activated. Computational model–based regression analysis using the predicted future rewards and prediction errors estimated from subjects' performance data revealed graded maps of time scale within the insula and the striatum: ventroanterior regions were involved in predicting immediate rewards and dorsoposterior regions were involved in predicting future rewards. These results suggest differential involvement of the cortico-basal ganglia loops in reward prediction at different time scales.
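The model-based regression described above relies on two per-trial signals, a predicted future reward and a reward prediction error, computed at different time scales. The following is a minimal sketch of how such signals could be generated, assuming a tabular temporal-difference (TD(0)) learner in which the discount factor `gamma` sets the time scale. It is an illustration, not the authors' implementation: the function name `td_regressors`, the learning rate `alpha`, the three-state cycle, and the reward magnitudes are all hypothetical.

```python
def td_regressors(states, rewards, gamma, alpha=0.1):
    """Return per-step value predictions and TD errors for one discount factor.

    states[t] is the state at step t, and rewards[t] is the reward received
    on the transition from states[t] to states[t + 1], so
    len(states) == len(rewards) + 1. All names here are illustrative.
    """
    values = {}                      # tabular value estimate V(s), default 0
    v_trace, delta_trace = [], []
    for t, r in enumerate(rewards):
        s, s_next = states[t], states[t + 1]
        v_s = values.get(s, 0.0)
        v_next = values.get(s_next, 0.0)
        # TD error: reward plus discounted next prediction, minus current one
        delta = r + gamma * v_next - v_s
        v_trace.append(v_s)
        delta_trace.append(delta)
        values[s] = v_s + alpha * delta
    return v_trace, delta_trace


# Hypothetical three-state cycle: two small losses, then one large gain.
n_cycles = 200
states = [0, 1, 2] * n_cycles + [0]
rewards = [-20.0, -20.0, 100.0] * n_cycles

# A small gamma tracks immediate rewards; a large gamma tracks future rewards.
for gamma in (0.0, 0.9):
    v, delta = td_regressors(states, rewards, gamma)
    print(f"gamma={gamma}: final predicted value in state 0 = {v[-3]:.1f}")
```

With a small `gamma`, the prediction in a state reflects only the immediate outcome, so a state that incurs a small loss is valued negatively. With a large `gamma`, the same state is valued positively because the later large reward is anticipated. Traces computed at several values of `gamma` could then serve as candidate regressors, one set per time scale.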
