Model-Based Reasoning in Humans Becomes Automatic with Training

Model-based and model-free reinforcement learning (RL) have been suggested as algorithmic realizations of goal-directed and habitual action strategies. Model-based RL is more flexible than model-free RL, but requires sophisticated calculations using a learnt model of the world. This has led model-based RL to be identified with slow, deliberative processing, and model-free RL with fast, automatic processing. In support of this distinction, it has recently been shown that model-based reasoning is impaired by placing subjects under cognitive load, a hallmark of non-automaticity. Here, using the same task, we show that cognitive load does not impair model-based reasoning if subjects receive prior training on the task. This finding is replicated across two studies and a variety of analysis methods. Thus, task familiarity permits the use of model-based reasoning in parallel with other cognitive demands. The ability to deploy model-based reasoning in an automatic, parallelizable fashion has widespread theoretical implications, particularly for the learning and execution of complex behaviors. It also suggests a range of important failure modes in psychiatric disorders.
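
To make the algorithmic contrast concrete, below is a minimal sketch of the two kinds of learner on a toy two-stage task loosely resembling the paradigm used in this line of work. The state labels, transition probabilities, reward probabilities, and learning rate are illustrative assumptions, not the authors' experimental parameters or code: the model-free agent caches first-stage action values directly from experienced reward, while the model-based agent learns a transition and reward model and computes action values by look-ahead at choice time.

# Illustrative sketch, not the authors' code. Task structure and parameters are assumed.
import random

ALPHA = 0.3      # learning rate (assumed)
N_TRIALS = 200

# First-stage actions lead probabilistically to one of two second-stage states.
TRANSITIONS = {"left":  {"A": 0.7, "B": 0.3},
               "right": {"A": 0.3, "B": 0.7}}
REWARD_PROB = {"A": 0.8, "B": 0.2}   # assumed fixed reward probabilities

# Model-free agent: one cached value per first-stage action, updated
# directly from the reward actually received (temporal-difference rule).
q_mf = {"left": 0.0, "right": 0.0}

# Model-based agent: learns the transition structure and per-state reward
# estimates, then evaluates first-stage actions by explicit look-ahead.
trans_counts = {a: {"A": 1, "B": 1} for a in ("left", "right")}
reward_est = {"A": 0.5, "B": 0.5}

def mb_value(action):
    counts = trans_counts[action]
    total = counts["A"] + counts["B"]
    return sum((counts[s] / total) * reward_est[s] for s in ("A", "B"))

for _ in range(N_TRIALS):
    action = random.choice(["left", "right"])          # random exploration
    state = "A" if random.random() < TRANSITIONS[action]["A"] else "B"
    reward = 1.0 if random.random() < REWARD_PROB[state] else 0.0

    # Model-free: update the cached action value, blind to task structure.
    q_mf[action] += ALPHA * (reward - q_mf[action])

    # Model-based: update the world model, not the action values themselves.
    trans_counts[action][state] += 1
    reward_est[state] += ALPHA * (reward - reward_est[state])

print("model-free action values: ", q_mf)
print("model-based action values:", {a: round(mb_value(a), 2) for a in ("left", "right")})

The point relevant to the cognitive-load argument is that the model-based agent's extra computation happens at choice time (the look-ahead in mb_value), which is why model-based control is usually cast as the slower, more deliberative and resource-dependent strategy.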
