Dopaminergic and frontal signals for decisions guided by sensory evidence and reward value

Making a decision often requires combining uncertain sensory evidence with learned reward values. It is not known how the brain performs this combination and learns from the outcomes of the resulting decisions. We trained mice in a decision task that requires combining visual evidence with recent reward values. Mice combined these factors efficiently: their decisions were guided by past rewards when visual stimuli provided uncertain evidence, but not when the stimuli were highly visible. The sequence of decisions was well described by a model that learns the values of stimulus-action pairs and combines them with sensory evidence. The model estimates how sensory evidence and reward value determine two key internal variables: the expected value of each decision and the prediction errors. We found that the first variable was explicitly represented in the activity of neuronal populations in prelimbic frontal cortex (PL) during choice execution. The second variable was explicitly represented in the activity of dopamine neurons of the ventral tegmental area (VTA) after stimulus presentation and after choice outcome. As predicted by the model, optogenetic manipulations of dopamine neurons altered future choices mainly when the sensory evidence was weak, establishing the causal role of these neurons in guiding choices informed by combinations of rewards and sensory evidence. These results provide a unified, quantitative framework for how the brain makes efficient choices when challenged with internal and environmental uncertainty.
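
The two internal variables named above can be illustrated with a minimal Python sketch. It is not the authors' model: the abstract gives no equations, so the Gaussian percept noise, the belief-weighted update rule, the learning rate `alpha`, and every function name below are assumptions chosen for illustration only.

```python
import math

SIDES = ("L", "R")


def belief_right(percept, sigma):
    """P(stimulus is on the right | percept).

    Assumption: the percept is the signed contrast plus Gaussian noise with
    standard deviation sigma, and the prior over contrast is flat.
    """
    return 0.5 * (1.0 + math.erf(percept / (sigma * math.sqrt(2.0))))


def action_values(values, belief):
    """Expected value of each action: stored stimulus-action values
    weighted by the current belief about the stimulus side."""
    return {
        action: sum(belief[side] * values[(side, action)] for side in SIDES)
        for action in SIDES
    }


def update(values, belief, choice, reward, alpha):
    """Compute the outcome prediction error and update stored values.

    Assumption: each stimulus-action value for the chosen action moves by
    alpha * error, scaled by the belief in that stimulus side.
    """
    error = reward - action_values(values, belief)[choice]
    for side in SIDES:
        values[(side, choice)] += alpha * error * belief[side]
    return error


if __name__ == "__main__":
    values = {(side, action): 0.0 for side in SIDES for action in SIDES}
    # A weak stimulus on the right: the percept is only slightly positive.
    p_right = belief_right(percept=0.05, sigma=0.2)
    belief = {"L": 1.0 - p_right, "R": p_right}
    error = update(values, belief, choice="R", reward=1.0, alpha=0.5)
    print("belief:", belief)
    print("prediction error:", error)
    print("action values after update:", action_values(values, belief))
```

In this sketch the expected value of a choice mixes sensory belief with learned value, so once values have been learned, a reward that follows a weak stimulus produces a larger prediction error than the same reward after a clearly visible one. That is one way, under these assumptions, to obtain the pattern in which learning affects future choices mainly when sensory evidence is weak.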
