Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control

A broad range of neural and behavioral data suggests that the brain contains multiple systems for behavioral choice, including one associated with prefrontal cortex and another with dorsolateral striatum. However, such a surfeit of control raises an additional choice problem: how to arbitrate between the systems when they disagree. Here, we consider dual-action choice systems from a normative perspective, using the computational theory of reinforcement learning. We identify a key trade-off pitting computational simplicity against the flexible and statistically efficient use of experience. The trade-off is realized in a competition between the dorsolateral striatal and prefrontal systems. We suggest a Bayesian principle of arbitration between them according to uncertainty, so each controller is deployed when it should be most accurate. This provides a unifying account of a wealth of experimental evidence about the factors favoring dominance by either system.
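As a rough illustration of the arbitration principle described above, the sketch below lets each controller report a value estimate and an uncertainty for every candidate action, and takes each action's value from whichever controller is less uncertain about it. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Estimate` type, the `arbitrate` and `select_action` helpers, the per-action comparison of variances, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Estimate:
    """Hypothetical summary of one controller's belief about an action."""
    mean: float      # estimated value of taking the action
    variance: float  # uncertainty about that estimate


def arbitrate(
    prefrontal: Dict[str, Estimate],
    striatal: Dict[str, Estimate],
) -> Dict[str, Tuple[str, float]]:
    """For each action, keep the value from the less uncertain controller.

    Assumption: uncertainty is compared action by action, and ties go to
    the prefrontal controller. Both choices are illustrative only.
    """
    chosen = {}
    for action in prefrontal:
        pf, st = prefrontal[action], striatal[action]
        if pf.variance <= st.variance:
            chosen[action] = ("prefrontal", pf.mean)
        else:
            chosen[action] = ("striatal", st.mean)
    return chosen


def select_action(chosen: Dict[str, Tuple[str, float]]) -> str:
    """Pick the action whose arbitrated value is highest."""
    return max(chosen, key=lambda action: chosen[action][1])


# Illustrative numbers only: the striatal estimates are still very uncertain,
# so the prefrontal estimates are used for both actions.
prefrontal = {"press": Estimate(1.0, 0.2), "wait": Estimate(0.1, 0.2)}
striatal = {"press": Estimate(0.3, 0.9), "wait": Estimate(0.5, 0.9)}

chosen = arbitrate(prefrontal, striatal)
print(chosen)
print(select_action(chosen))
```

In this toy setting, lowering the striatal variances below the prefrontal ones would hand control to the striatal estimates, which is the qualitative behavior the arbitration principle calls for: each controller is used when it is expected to be most accurate.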
