Explicit knowledge of task structure is the primary determinant of human model-based action

Explicit information obtained through instruction profoundly shapes human choice behaviour. However, this has been studied in computationally simple tasks, and it is unknown how model-based and model-free systems, respectively generating goal-directed and habitual actions, are affected by the absence or presence of instructions. We assessed behaviour in a novel variant of a computationally more complex decision-making task, before and after providing information about task structure, both in healthy volunteers and in individuals suffering from obsessive-compulsive disorder (OCD) or other disorders. Initial behaviour was model-free, with rewards directly reinforcing preceding actions. Model-based control, employing predictions of states resulting from each action, emerged with experience in a minority of subjects, and less in OCD. Providing task structure information strongly increased model-based control, similarly across all groups. Thus, explicit task structural knowledge determines human use of model-based reinforcement learning, and is most readily acquired from instruction rather than experience.
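The contrast drawn above between model-free and model-based control can be illustrated with a minimal Python sketch. It is not the task or the computational model used in the study: the action and state names, the transition probabilities, the state values, and the learning rate `alpha` are all hypothetical values chosen for illustration. The model-free learner lets a reward reinforce the action that preceded it, whereas the model-based learner values each action by the states it is predicted to lead to.

```python
def model_free_update(q, action, reward, alpha=0.5):
    """Return updated action values: the reward directly reinforces
    the action that preceded it, ignoring which state was reached."""
    updated = dict(q)
    updated[action] += alpha * (reward - updated[action])
    return updated


def model_based_values(transitions, state_values):
    """Value each action as the expected value of the states it is
    predicted to produce, given known transition probabilities."""
    return {
        action: sum(p * state_values[state] for state, p in probs.items())
        for action, probs in transitions.items()
    }


# Hypothetical task structure: each action usually leads to one state.
transitions = {
    "left": {"A": 0.8, "B": 0.2},
    "right": {"A": 0.2, "B": 0.8},
}

# One trial: "left" is chosen, an uncommon transition leads to state B,
# and a reward of 1 is received there.
q_mf = model_free_update({"left": 0.0, "right": 0.0}, "left", reward=1.0)

# Assumed state values after that trial: B looks rewarding, A does not.
q_mb = model_based_values(transitions, {"A": 0.0, "B": 1.0})

mf_choice = max(q_mf, key=q_mf.get)  # the action that was just rewarded
mb_choice = max(q_mb, key=q_mb.get)  # the action most likely to reach B
```

In this sketch the two learners disagree after a reward that follows an uncommon transition: the model-free learner repeats the rewarded action, while the model-based learner switches to the action that most often leads to the rewarded state. Knowing the transition structure, whether from instruction or from experience, is what makes the second computation possible.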
