Decoding the Formation of Reward Predictions across Learning

The predicted reward of different behavioral options plays an important role in guiding decisions. Previous research has identified reward predictions in prefrontal and striatal brain regions. Moreover, it has been shown that the neural representation of a predicted reward is similar to the neural representation of the actual reward outcome. However, it has remained unknown how these representations emerge over the course of learning and how they relate to decision making. Here, we investigated the learning of predicted reward representations using functional magnetic resonance imaging and multivariate pattern classification. Using a Pavlovian conditioning procedure, human subjects learned multiple novel cue–outcome associations in each scanning run. We demonstrate that, across learning, activity patterns coding the value of predicted rewards in the orbitofrontal cortex, the dorsolateral prefrontal cortex (DLPFC), and the dorsal striatum become similar to the patterns coding the value of actual reward outcomes. Furthermore, we provide evidence that predicted reward representations in the striatum precede those in prefrontal regions and that representations in the DLPFC are linked to subsequent value-based choices. Our results show that different brain regions represent outcome predictions by eliciting the neural representation of the actual outcome. Furthermore, they suggest that reward predictions in the DLPFC are directly related to value-based choices.
