Tonic dopamine: opportunity costs and the control of response vigor

Rationale: Dopamine neurotransmission has long been known to exert a powerful influence over the vigor, strength, or rate of responding. However, there exists no clear understanding of the computational foundation for this effect; predominant accounts of dopamine’s computational function focus on a role for phasic dopamine in controlling the discrete selection between different actions and have nothing to say about response vigor or indeed the free-operant tasks in which it is typically measured.

Objectives: We seek to accommodate free-operant behavioral tasks within the realm of models of optimal control and thereby capture how dopaminergic and motivational manipulations affect response vigor.

Methods: We construct an average reward reinforcement learning model in which subjects choose both which action to perform and also the latency with which to perform it. Optimal control balances the costs of acting quickly against the benefits of getting reward earlier and thereby chooses a best response latency.

Results: In this framework, the long-run average rate of reward plays a key role as an opportunity cost and mediates motivational influences on rates and vigor of responding. We review evidence suggesting that the average reward rate is reported by tonic levels of dopamine, putatively in the nucleus accumbens.

Conclusions: Our extension of reinforcement learning models to free-operant tasks unites psychologically and computationally inspired ideas about the role of tonic dopamine in striatum, explaining from a normative point of view why higher levels of dopamine might be associated with more vigorous responding.
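The latency trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not give a functional form for the cost of acting quickly, so the code below assumes a hyperbolic vigor cost (`vigor_cost / latency`) purely for illustration; the names `vigor_cost`, `avg_reward_rate`, `latency_cost`, and `optimal_latency` are hypothetical. Under that assumption, each second of delay forfeits the average reward rate, and the cost-minimizing latency is the square root of the vigor cost divided by the average reward rate.

```python
import math


def latency_cost(latency, vigor_cost, avg_reward_rate):
    """Total cost of responding with a given latency (in seconds).

    ASSUMPTION: the cost of acting quickly is hyperbolic, vigor_cost / latency.
    The opportunity cost is the average reward rate forgone while waiting.
    """
    return vigor_cost / latency + avg_reward_rate * latency


def optimal_latency(vigor_cost, avg_reward_rate):
    """Latency minimizing latency_cost under the hyperbolic-cost assumption.

    Setting the derivative -vigor_cost / latency**2 + avg_reward_rate to zero
    gives latency = sqrt(vigor_cost / avg_reward_rate).
    """
    return math.sqrt(vigor_cost / avg_reward_rate)


if __name__ == "__main__":
    # A higher average reward rate (putatively, higher tonic dopamine)
    # raises the opportunity cost of time and shortens the chosen latency.
    for rate in (0.25, 1.0, 4.0):
        print(rate, optimal_latency(vigor_cost=4.0, avg_reward_rate=rate))
```

In this sketch, raising `avg_reward_rate` shortens the optimal latency, which is the qualitative link between a higher average reward rate and more vigorous responding that the abstract argues for.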
