How fast to work: Response vigor, motivation and tonic dopamine

Reinforcement learning (RL) models have long promised to unify computational, psychological and neural accounts of appetitively conditioned behavior. However, the bulk of data on animal conditioning comes from free-operant experiments that measure how fast animals will work for reinforcement. Existing RL models are silent about these tasks because they lack any notion of vigor. They thus fail to address the simple observation that hungrier animals will work harder for food, as well as stranger facts, such as hungry animals sometimes working harder even for irrelevant outcomes such as water. Here, we develop an RL framework for free-operant behavior, suggesting that subjects choose how vigorously to perform selected actions by optimally balancing the costs and benefits of quick responding. Motivational states such as hunger shift these costs and benefits, skewing the tradeoff. This accounts normatively for the effects of motivation on response rates, as well as many other classic findings. Finally, we suggest that tonic levels of dopamine may be involved in the computation linking motivational state to optimal responding, thereby explaining the complex vigor-related effects of pharmacological manipulation of dopamine.
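To make the cost-benefit tradeoff concrete, the following is a minimal illustrative sketch, not the model as specified in this abstract. It assumes two hypothetical cost terms: a vigor cost that grows as the response latency shrinks (`vigor_cost / latency`) and an opportunity cost of time spent not collecting reward (`reward_rate * latency`). Both functional forms, and the names `vigor_cost` and `reward_rate`, are assumptions chosen for illustration. Under these assumptions, a motivational state that raises the value of rewards raises the opportunity cost of sloth and so shortens the optimal latency.

```python
import math


def total_cost(latency: float, vigor_cost: float, reward_rate: float) -> float:
    """Cost of responding with a given latency (assumed functional form).

    vigor_cost / latency   : acting faster is more costly (assumption)
    reward_rate * latency  : time spent waiting forgoes reward (assumption)
    """
    return vigor_cost / latency + reward_rate * latency


def optimal_latency(vigor_cost: float, reward_rate: float) -> float:
    """Latency minimizing total_cost.

    Setting the derivative -vigor_cost / latency**2 + reward_rate to zero
    gives latency = sqrt(vigor_cost / reward_rate).
    """
    return math.sqrt(vigor_cost / reward_rate)


# Hypothetical numbers: hunger is modeled as a higher effective reward rate.
sated_latency = optimal_latency(vigor_cost=1.0, reward_rate=0.5)
hungry_latency = optimal_latency(vigor_cost=1.0, reward_rate=2.0)

print(f"sated latency:  {sated_latency:.3f}")
print(f"hungry latency: {hungry_latency:.3f}")
```

In this sketch the hungry subject responds with a shorter latency than the sated one, which is the qualitative pattern the abstract describes. The sketch says nothing about how tonic dopamine would carry the reward-rate signal; that link is the abstract's hypothesis and is not modeled here.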
