Deliberation and Procedural Automation on a Two-Step Task for Rats

Current theories suggest that decision-making arises from multiple, competing action-selection systems. Rodent studies dissociate deliberative and procedural behavior, and find a transition from deliberative to procedural behavior with experience. However, it remains unknown how this transition from deliberative to procedural control evolves within single trials, or within blocks of repeated choices. We adapted for rats a two-step task that has been used to dissociate model-based from model-free decisions in humans. We found that a mixture of model-based and model-free algorithms was more likely to explain rat choice strategies on the task than either model-based or model-free algorithms alone. The task contained two choices per trial, which provides a more complex and non-discrete per-trial choice structure. This structure enabled us to evaluate how deliberative and procedural behavior evolved within trials and within blocks of repeated choice sequences. We found that vicarious trial and error (VTE), a behavioral correlate of deliberation in rodents, was correlated between the two choice points on a given lap. We also found that behavioral stereotypy, a correlate of procedural automation, increased with the number of repeated choices. While VTE at the first choice point decreased with the number of repeated choices, VTE at the second choice point did not, and increased only after unexpected transitions within the task. This suggests that deliberation at the beginning of trials may correspond to changes in choice patterns, while mid-trial deliberation may correspond to an interruption of a procedural process.
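
As a rough illustration of what a mixture of model-based and model-free algorithms can mean on a two-step task, the sketch below combines the two sets of first-stage action values with a single weight and passes the result through a softmax. This follows the general form of hybrid models commonly used with two-step tasks; it is not the model fitted in this study. The function names, the weight `w`, the inverse temperature `beta`, and all numeric values are assumptions chosen for illustration.

```python
import math

def model_based_values(transition_probs, second_stage_values):
    """First-stage values computed from a transition model (hypothetical helper).

    transition_probs[a][s] is the assumed probability of reaching
    second-stage state s after first-stage action a.
    """
    return [
        sum(p * v for p, v in zip(row, second_stage_values))
        for row in transition_probs
    ]

def hybrid_values(q_mf, q_mb, w):
    """Weighted mix: w = 1 is purely model-based, w = 0 purely model-free."""
    return [w * mb + (1.0 - w) * mf for mf, mb in zip(q_mf, q_mb)]

def softmax(values, beta):
    """Choice probabilities from action values; beta is an inverse temperature."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative numbers only (not data from the study).
q_mf = [0.3, 0.6]                      # cached (model-free) first-stage values
q_mb = model_based_values(
    [[0.8, 0.2], [0.2, 0.8]],          # common vs. rare transitions
    [1.0, 0.0],                        # best value available in each second-stage state
)
choice_probs = softmax(hybrid_values(q_mf, q_mb, w=0.5), beta=3.0)
```

In a sketch like this, a fitted `w` near 0 or 1 would indicate that one system alone accounts for choices, while an intermediate value corresponds to a mixture.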
