Determining the Neural Substrates of Goal-Directed Learning in the Human Brain

Instrumental conditioning is thought to involve at least two distinct learning systems: a goal-directed system that learns associations between responses and the incentive value of outcomes, and a habit system that learns associations between stimuli and responses without any link to the outcome that the response produced. Lesion studies in rodents suggest that these two components of instrumental conditioning may be mediated by anatomically distinct neural systems. The aim of the present study was to determine the neural substrates of the goal-directed component of instrumental learning in humans. Nineteen human subjects were scanned with functional magnetic resonance imaging while they learned to choose instrumental actions that were associated with the subsequent delivery of different food rewards (tomato juice, chocolate milk, and orange juice). After training, one of these foods was devalued by feeding the subject that food to satiety. The subjects were then scanned again while being re-exposed to the instrumental choice procedure (in extinction). We hypothesized that regions of the brain involved in goal-directed learning would show changes in their activity as a function of outcome devaluation. Our results indicate that neural activity in one brain region in particular, the orbitofrontal cortex, was strongly modulated during selection of a devalued compared with a nondevalued action. These results suggest an important contribution of the orbitofrontal cortex to guiding goal-directed instrumental choices in humans.
