Single-Trial Inhibition of Anterior Cingulate Disrupts Model-based Reinforcement Learning in a Two-step Decision Task.

The anterior cingulate cortex (ACC) is implicated in learning the value of actions, and thus in allowing past outcomes to influence the current choice. However, it is not clear whether or how it contributes to the two major ways such learning is thought to happen: model-based mechanisms that learn action-state predictions and use these to infer action values, or model-free mechanisms that learn action values directly through reward prediction errors. Having confirmed, using a classical probabilistic reversal learning task, that optogenetic inhibition of ACC neurons on single trials indeed affected reinforcement learning (RL), we examined the consequences of this manipulation in a novel two-step decision task designed to dissociate model-free and model-based learning mechanisms in mice. On the two-step task, silencing spared the influence of the trial outcome but reduced the influence of the experienced state transition. Analysis using RL models indicated that ACC inhibition disrupted model-based RL mechanisms.
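The distinction between the two learning mechanisms can be illustrated with a minimal sketch. This is not the model fitted in the study: the two-action, two-state layout, the learning rates, the starting values and all function names below are assumptions chosen only for illustration. It shows how a single rewarded trial that follows a rare transition pushes the two learners toward opposite first-step choices.

```python
import numpy as np

# Hypothetical layout: 2 first-step actions, 2 second-step states.
# All numbers below are illustrative assumptions, not values from the study.

def model_free_update(q, action, reward, alpha=0.5):
    """Update the cached value of the chosen action with a reward prediction error."""
    q = q.copy()
    q[action] += alpha * (reward - q[action])
    return q

def update_transition(trans, action, state, alpha=0.1):
    """Update the action-state prediction for the chosen action toward the observed state."""
    trans = trans.copy()
    observed = np.zeros(trans.shape[1])
    observed[state] = 1.0
    trans[action] += alpha * (observed - trans[action])
    return trans

def update_state_value(v, state, reward, alpha=0.5):
    """Update the value of the second-step state that was reached."""
    v = v.copy()
    v[state] += alpha * (reward - v[state])
    return v

def model_based_values(trans, v):
    """Infer action values by combining action-state predictions with state values."""
    return trans @ v

# One trial: action 0 is chosen, a rare transition leads to state 1, reward is delivered.
action, state, reward = 0, 1, 1.0

q_mf = model_free_update(np.array([0.5, 0.5]), action, reward)

trans = update_transition(np.array([[0.8, 0.2], [0.2, 0.8]]), action, state)
v = update_state_value(np.array([0.5, 0.5]), state, reward)
q_mb = model_based_values(trans, v)

print("model-free values: ", q_mf)  # the chosen action 0 gains value
print("model-based values:", q_mb)  # action 1, which commonly leads to state 1, gains more
```

In this sketch the model-free learner credits the action that was taken, regardless of which transition occurred, whereas the model-based learner credits the action most likely to reach the rewarded state. A dependence of the next choice on the experienced transition is therefore the signature of model-based learning, which is the influence the abstract reports as reduced by ACC silencing.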
