Human Neural Learning Depends on Reward Prediction Errors in the Blocking Paradigm

Learning occurs when an outcome deviates from expectation (prediction error). According to formal learning theory, the defining paradigm demonstrating the role of prediction errors in learning is the blocking test. Here, learning about a novel stimulus is blocked when that stimulus is paired with a fully predicted outcome, presumably because the occurrence of the outcome fails to produce a prediction error. We investigated the role of prediction errors in human reward-directed learning using a blocking paradigm and measured brain activation with functional magnetic resonance imaging. Participants showed blocking of behavioral learning with juice rewards as predicted by learning theory. The medial orbitofrontal cortex and the ventral putamen showed significantly lower responses to blocked, compared with nonblocked, reward-predicting stimuli. In reward-predicting control situations, deactivations in orbitofrontal cortex and ventral putamen occurred at the time of unpredicted reward omissions. Responses in discrete parts of orbitofrontal cortex correlated with the degree of behavioral learning during, and after, the learning phase. These data suggest that learning in primary reward structures in the human brain correlates with prediction errors in a manner that complies with principles of formal learning theory.
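
To make the blocking logic concrete, the following is a minimal sketch of how an error-driven learning rule produces blocking. It assumes a Rescorla-Wagner style update, which is one common formalization of "formal learning theory" and is not stated in the abstract as the model the authors used. The stimulus labels (A, B, X, Y), the learning rate `alpha`, the reward magnitude `lam`, and the trial counts are all hypothetical choices made for illustration.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Run a Rescorla-Wagner style update over a sequence of trials.

    trials: list of (stimuli, rewarded) pairs, where stimuli is a tuple of
            stimulus labels presented together and rewarded is a bool.
    alpha:  learning rate (hypothetical value).
    lam:    asymptotic value supported by the reward (hypothetical value).
    Returns a dict mapping each stimulus to its associative strength.
    """
    V = {}
    for stimuli, rewarded in trials:
        # Prediction is the summed strength of all stimuli present.
        prediction = sum(V.get(s, 0.0) for s in stimuli)
        # Prediction error: actual outcome minus predicted outcome.
        error = (lam if rewarded else 0.0) - prediction
        # Every stimulus present is updated by the same shared error.
        for s in stimuli:
            V[s] = V.get(s, 0.0) + alpha * error
    return V


# Assumed design for illustration only:
# Pretraining: A is followed by reward, B is not.
pretraining = [(("A",), True), (("B",), False)] * 50
# Compound phase: AX and BY are both followed by reward.
compound = [(("A", "X"), True), (("B", "Y"), True)] * 50

V = rescorla_wagner(pretraining + compound)
for stimulus in ("A", "B", "X", "Y"):
    print(stimulus, round(V[stimulus], 3))
```

In this sketch, X gains almost no associative strength because A already predicts the reward and the prediction error on AX trials is close to zero. Y is paired with the unrewarded B, so the reward on BY trials is initially unpredicted and Y acquires substantial strength. This mirrors the contrast between blocked and nonblocked stimuli described above, under the stated modeling assumptions.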
