Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys

People and other animals learn the values of choices by observing the contingencies between them and their outcomes. However, decisions are not guided by choice-linked reward associations alone; macaques also maintain a memory of the general, average reward rate (the global reward state) in an environment. Remarkably, global reward state affects the way that each choice outcome is valued and influences future decisions, so that the impact of both choice success and failure differs between rich and poor environments. Successful choices are more likely to be repeated, especially in rich environments. Unsuccessful choices are more likely to be abandoned, especially in poor environments. Functional magnetic resonance imaging (fMRI) revealed two distinct patterns of activity, one in anterior insula and one in the dorsal raphe nucleus, that track global reward state as well as specific outcome events.

Wittmann and colleagues show that not only single outcome events but also the global reward state (GRS) impact learning in macaques; low GRS drives explorative choices. Analyses of macaque BOLD signal reveal that GRS impacts activity in the anterior insula as well as the dorsal raphe nucleus.
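The behavioural pattern described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' model: it assumes a simple delta rule in which a positively weighted global reward state is added to the prediction error. The function name `update_values` and the parameters `alpha`, `alpha_grs` and `w_grs` are hypothetical, chosen only for illustration.

```python
def update_values(q, grs, reward, alpha=0.3, alpha_grs=0.1, w_grs=0.5):
    """One trial of a hypothetical GRS-modulated delta rule.

    q       -- current value estimate of the chosen option
    grs     -- current global reward state (running average of reward)
    reward  -- outcome on this trial (e.g. 1.0 for reward, 0.0 for none)
    All parameter values are illustrative assumptions, not fitted estimates.
    """
    # Assumption: the global reward state shifts the prediction error upward,
    # more strongly in rich environments than in poor ones.
    delta = reward - q + w_grs * grs
    new_q = q + alpha * delta
    # The global reward state tracks outcomes regardless of which option was chosen.
    new_grs = grs + alpha_grs * (reward - grs)
    return new_q, new_grs


if __name__ == "__main__":
    q0 = 0.5  # same starting value in both environments
    for label, grs in (("rich", 0.8), ("poor", 0.2)):
        q_win, _ = update_values(q0, grs, reward=1.0)
        q_loss, _ = update_values(q0, grs, reward=0.0)
        print(f"{label}: value after win {q_win:.2f}, after loss {q_loss:.2f}")
```

Under these assumptions, a rewarded choice gains more value when the global reward state is high, and an unrewarded choice loses more value when it is low. If choices follow value, that corresponds to stronger win-stay behaviour in rich environments and stronger lose-shift behaviour in poor ones.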


[35]  Tatsuo K Sato,et al.  Dopamine neurons learn to encode the long-term value of multiple future rewards , 2011, Proceedings of the National Academy of Sciences.

[36]  Nathaniel D. Daw,et al.  Trial-by-trial data analysis using computational models , 2011 .

[37]  H. Seo,et al.  A reservoir of time constants for memory traces in cortical neurons , 2011, Nature Neuroscience.

[38]  Timothy E. J. Behrens,et al.  Double dissociation of value computations in orbitofrontal and anterior cingulate neurons , 2011, Nature Neuroscience.

[39]  Christoph W Korn,et al.  How unrealistic optimism is maintained in the face of reality , 2011, Nature Neuroscience.

[40]  John M. Pearson,et al.  Neuronal basis of sequential foraging decisions in a patchy environment , 2011, Nature Neuroscience.

[41]  R. Passingham,et al.  The Neurobiology of the Prefrontal Cortex: Anatomy, Evolution, and the Origin of Insight , 2012 .

[42]  N. Uchida,et al.  The dorsomedial striatum encodes net expected return, critical for energizing performance vigor , 2013, Nature Neuroscience.

[43]  Daeyeol Lee,et al.  Cortical Signals for Rewarded Actions and Strategic Exploration , 2013, Neuron.

[44]  R. Saunders,et al.  Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating , 2013, Nature Neuroscience.

[45]  R. Dolan,et al.  Losing the rose tinted glasses: neural substrates of unbiased belief updating in depression , 2014, Front. Hum. Neurosci..

[46]  K. Sakai,et al.  Autonomous Mechanism of Internal Choice Estimate Underlies Decision Inertia , 2014, Neuron.

[47]  Wim Vanduffel,et al.  The Retinotopic Organization of Macaque Occipitotemporal Cortex Anterior to V4 and Caudoventral to the Middle Temporal (MT) Cluster , 2014, The Journal of Neuroscience.

[48]  M. Khamassi,et al.  Accounting for Negative Automaintenance in Pigeons: A Dual Learning Systems Approach and Factored Representations , 2014, PloS one.

[49]  M. Khamassi,et al.  Contextual modulation of value signals in reward and punishment learning , 2015, Nature Communications.

[50]  G. Schoenbaum,et al.  What the orbitofrontal cortex does not do , 2015, Nature Neuroscience.

[51]  Matthew F.S. Rushworth,et al.  Contrasting Roles for Orbitofrontal Cortex and Amygdala in Credit Assignment and Learning in Macaques , 2015, Neuron.

[52]  Timothy Edward John Behrens,et al.  Value, search, persistence and model updating in anterior cingulate cortex , 2016, Nature Neuroscience.

[53]  Mathias Pessiglione,et al.  How prior preferences determine decision-making frames and biases in the human brain , 2016, eLife.

[54]  Timothy Edward John Behrens,et al.  Reward-Guided Learning with and without Causal Attribution , 2016, Neuron.

[55]  Nils Kolling,et al.  Predictive decision making driven by multiple time-linked reward representations in the anterior cingulate cortex , 2016, Nature Communications.

[56]  Z. Mainen,et al.  Activity patterns of serotonin neurons underlying cognitive flexibility , 2017, eLife.

[57]  Jacqueline Scholl,et al.  Simultaneous representation of a spectrum of dynamically changing value estimates during decision making , 2017, Nature Communications.

[58]  Mel W. Khaw,et al.  Reminders of past choices bias decisions for reward in humans , 2017, Nature Communications.

[59]  Vincent D Costa,et al.  Motivational neural circuits underlying reinforcement learning , 2017, Nature Neuroscience.

[60]  Elisabeth A. Murray,et al.  Specialized Representations of Value in the Orbital and Ventrolateral Prefrontal Cortex: Desirability versus Availability of Outcomes , 2017, Neuron.

[61]  Bolton K. H. Chau,et al.  Inverted activity patterns in ventromedial prefrontal cortex during value-guided decision-making in a less-is-more task , 2017, Nature Communications.

[62]  Lesley K Fellows,et al.  Contrasting Effects of Medial and Lateral Orbitofrontal Cortex Lesions on Credit Assignment and Decision-Making in Humans , 2017, The Journal of Neuroscience.

[63]  C. H. Donahue,et al.  Metaplasticity as a Neural Substrate for Adaptive Learning and Choice under Uncertainty , 2017, Neuron.

[64]  Karen J. Mullinger,et al.  Spatiotemporal neural characterization of prediction error valence and surprise during reward learning in humans , 2017, Scientific Reports.

[65]  Laurence T. Hunt,et al.  Triple Dissociation of Attention and Decision Computations across Prefrontal Cortex , 2017, Nature Neuroscience.

[66]  Madalena S. Fonseca,et al.  An effect of serotonergic stimulation on learning rates for rewards apparent after long intertrial intervals , 2018, Nature Communications.

[67]  Bolton K. H. Chau,et al.  The macaque anterior cingulate cortex translates counterfactual choice value into actual behavioral change , 2018, Nature Neuroscience.

[68]  M. Philiastides,et al.  Neural correlates of weighted reward prediction error during reinforcement learning classify response to cognitive behavioral therapy in depression , 2019, Science Advances.

[69]  J. Krebs,et al.  Foraging Theory , 2019 .

[70]  Peter M. Kaskan,et al.  Gustatory responses in macaque monkeys revealed with fMRI: Comments on taste, taste preference, and internal state , 2019, NeuroImage.

[71]  Adam G. Thomas,et al.  Behavioral flexibility is associated with changes in structure and function distributed across a frontal cortical network in macaques , 2019, bioRxiv.