Individual differences in reinforcement learning: Behavioral, electrophysiological, and neuroimaging correlates

During reinforcement learning, phasic modulations of activity in midbrain dopamine neurons are conveyed to the dorsal anterior cingulate cortex (dACC) and basal ganglia (BG) and serve to guide adaptive responding. While the animal literature supports a role for the dACC in integrating reward history over time, most human electrophysiological studies of dACC function have focused on responses to single positive and negative outcomes. The present electrophysiological study investigated the role of the dACC in probabilistic reward learning in healthy subjects using a task that required integration of reinforcement history over time. We recorded the feedback-related negativity (FRN) to reward feedback in subjects who developed a response bias toward a more frequently rewarded ("rich") stimulus ("learners") versus subjects who did not ("non-learners"). Compared to non-learners, learners showed more positive (i.e., smaller) FRNs and greater dACC activation upon receiving reward for correct identification of the rich stimulus. In addition, dACC activation and a bias to select the rich stimulus were positively correlated. The same participants also completed a monetary incentive delay (MID) task administered during functional magnetic resonance imaging. Compared to non-learners, learners displayed stronger BG responses to reward in the MID task. These findings raise the possibility that learners in the probabilistic reinforcement task were characterized by stronger dACC and BG responses to rewarding outcomes. Furthermore, these results highlight the importance of the dACC to probabilistic reward learning in humans.
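
The learner versus non-learner split described above rests on a response-bias measure. The abstract does not give the formula, so the following is only a minimal sketch under stated assumptions: it uses a signal-detection-style log bias (log b) of the kind commonly reported for probabilistic reward tasks, adds 0.5 to every cell to avoid division by zero, and classifies a subject as a learner when bias increases from the first to the last block. The function names, the 0.5 correction, and the `threshold` parameter are illustrative assumptions, not the authors' confirmed procedure.

```python
import math


def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """Return log b, a signal-detection-style response bias toward the rich stimulus.

    Assumed formula (not stated in the abstract):
        log b = 0.5 * log10((rich_correct * lean_incorrect)
                            / (rich_incorrect * lean_correct))
    Positive values indicate a bias toward reporting the rich stimulus.
    """
    # Assumed correction: add 0.5 to each cell so empty cells cannot
    # produce a division by zero or log(0).
    rc, ri, lc, li = (
        count + 0.5
        for count in (rich_correct, rich_incorrect, lean_correct, lean_incorrect)
    )
    return 0.5 * math.log10((rc * li) / (ri * lc))


def is_learner(bias_first_block, bias_last_block, threshold=0.0):
    """Hypothetical classification rule: bias must grow by more than `threshold`."""
    return (bias_last_block - bias_first_block) > threshold


# Illustrative counts only (not data from the study).
first = response_bias(30, 20, 30, 20)  # symmetric responding, no bias
last = response_bias(40, 10, 30, 20)   # more correct "rich" responses
print(is_learner(first, last))
```

With these made-up counts, symmetric responding yields a bias of zero, and a shift toward the rich stimulus yields a positive bias, so the hypothetical subject would be labeled a learner.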
