A State Representation for Reinforcement Learning and Decision-Making in the Orbitofrontal Cortex

Despite decades of research, exactly how the orbitofrontal cortex (OFC) influences cognitive function remains mysterious. Anatomically, the OFC is characterized by remarkably broad connectivity to sensory, limbic and subcortical areas, and functional studies have implicated it in a plethora of functions ranging from facial processing to value-guided choice. Notwithstanding this diversity of findings, much research suggests that one important function of the OFC is to support decision-making and reinforcement learning. Here, we describe a novel theory positing that the OFC’s specific role in decision-making is to provide an up-to-date representation of task-related information, called a state representation. This representation reflects a mapping between distinct task states and sensory as well as unobservable information. We summarize evidence supporting the existence of such state representations in rodent and human OFC and argue that forming them provides a crucial scaffold that allows animals to perform decision-making and reinforcement learning efficiently in high-dimensional and partially observable environments. Finally, we argue that our theory offers an integrative framework for linking the diverse functions ascribed to the OFC and is in line with its wide-ranging connectivity.
