Dissociable neural representations of reinforcement and belief prediction errors underlie strategic learning

Decision-making in the presence of other competitive intelligent agents is fundamental for social and economic behavior. Such decisions require agents to behave strategically: in addition to learning about the rewards and punishments available in the environment, they also need to anticipate and respond to the actions of others competing for the same rewards. However, whereas we know much about strategic learning at both theoretical and behavioral levels, we know relatively little about the underlying neural mechanisms. Here, using a multi-strategy competitive learning paradigm, we show that strategic choices can be characterized by extending the reinforcement learning (RL) framework to incorporate agents’ beliefs about the actions of their opponents. Furthermore, using this characterization to generate putative internal values, we used model-based functional magnetic resonance imaging to investigate the neural computations underlying strategic learning. We found that the distinct notions of prediction error derived from our computational model are processed in partially overlapping but distinct sets of brain regions. Specifically, the RL prediction error was correlated with activity in the ventral striatum. In contrast, a previously uncharacterized belief-based prediction error was correlated with activity in both the ventral striatum and the rostral anterior cingulate (rACC). Furthermore, activity in rACC reflected individual differences in the degree of engagement in belief learning. These results suggest a model of strategic behavior in which learning arises from the interaction of dissociable reinforcement and belief-based inputs.
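To make the two notions of prediction error concrete, the following is a minimal illustrative sketch and not the computational model fitted in the study. It assumes a simple delta-rule learner for action values and a fictitious-play-style belief vector over the opponent's actions. All names (`rl_prediction_error`, `belief_prediction_error`, `update`) and the learning rates `alpha` and `eta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rl_prediction_error(values, action, reward):
    """Reward received minus the value currently attached to the chosen action."""
    return reward - values[action]


def belief_prediction_error(beliefs, opponent_action):
    """One-hot encoding of the opponent's observed action minus the believed probabilities."""
    observed = np.zeros_like(beliefs)
    observed[opponent_action] = 1.0
    return observed - beliefs


def update(values, beliefs, action, opponent_action, reward, alpha=0.2, eta=0.2):
    """Apply one delta-rule step to both quantities (assumed learning rates alpha, eta)."""
    values = values.copy()
    beliefs = beliefs.copy()
    rl_pe = rl_prediction_error(values, action, reward)
    belief_pe = belief_prediction_error(beliefs, opponent_action)
    values[action] += alpha * rl_pe   # only the chosen action's value changes
    beliefs += eta * belief_pe        # beliefs shift toward the observed action
    return values, beliefs, rl_pe, belief_pe


# Hypothetical single trial: two own actions, two opponent actions.
values, beliefs, rl_pe, belief_pe = update(
    values=np.zeros(2),
    beliefs=np.array([0.5, 0.5]),
    action=0,
    opponent_action=1,
    reward=1.0,
)
```

In this sketch the RL error depends only on the agent's own outcome, whereas the belief error depends only on what the opponent did, which is one simple way the two signals can dissociate trial by trial. Because the belief error sums to zero, the updated beliefs remain a probability distribution.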
