Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation.

The basal ganglia support learning to exploit decisions that have yielded positive outcomes in the past. In contrast, limited evidence implicates the prefrontal cortex in the process of making strategic exploratory decisions when the magnitude of potential outcomes is unknown. Here we examine neurogenetic contributions to individual differences in these distinct aspects of motivated human behavior, using a temporal decision-making task and computational analysis. We show that two genes controlling striatal dopamine function, DARPP-32 (also called PPP1R1B) and DRD2, are associated with exploitative learning to adjust response times incrementally as a function of positive and negative decision outcomes. In contrast, a gene primarily controlling prefrontal dopamine function (COMT) is associated with a particular type of 'directed exploration', in which exploratory decisions are made in proportion to Bayesian uncertainty about whether other choices might produce outcomes that are better than the status quo. Quantitative model fits reveal that genetic factors modulate independent parameters of a reinforcement learning system.
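The idea of exploring in proportion to Bayesian uncertainty can be illustrated with a minimal sketch. It is not the fitted model from the study. It assumes that beliefs about whether a class of responses (here, hypothetically, "fast" versus "slow" response times) beats the status quo are tracked as Beta distributions, and that an exploration bonus scales with the difference in their standard deviations. The function names, the uniform Beta(1, 1) prior, and the scaling parameter `epsilon` are illustrative assumptions.

```python
import math


def beta_std(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


def update(counts, improved):
    """Add one observation to the (a, b) pseudo-counts.

    `improved` is True when the outcome was better than expected
    (counted in a) and False otherwise (counted in b).
    """
    a, b = counts
    return (a + 1, b) if improved else (a, b + 1)


def exploration_bonus(slow, fast, epsilon):
    """Hypothetical directed-exploration term.

    Positive values favour shifting toward slower responses because their
    outcomes are more uncertain; negative values favour faster responses.
    `epsilon` is an assumed per-individual scaling parameter.
    """
    return epsilon * (beta_std(*slow) - beta_std(*fast))


# Both response classes start from a uniform Beta(1, 1) prior (an assumption).
fast = (1, 1)
slow = (1, 1)

# Only fast responses are sampled, so only their uncertainty shrinks.
for outcome in [True, False, True, True, False, True]:
    fast = update(fast, outcome)

bonus = exploration_bonus(slow, fast, epsilon=1.0)
print(bonus > 0)  # the unsampled (slow) option is more uncertain
```

In this sketch, an individual with a larger `epsilon` shifts more strongly toward the less-sampled option, which is one way to read the individual differences in directed exploration described above.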
