Motivational effects on behavior: Towards a reinforcement learning model of rates of responding

There is something rotten in the state of current reinforcement learning models of conditioning. Most experiments report measures such as the rate of responding given the state of the subject and the configuration of the environment. There is even further sub-structure in the temporal organisation of responses, which includes effects associated with the expected timing of forthcoming rewards[1]. However, current theoretical treatments are largely restricted to choices between discrete actions, and are silent about rates or any other aspect of action at a fine temporal scale. This lacuna is brought into sharp relief by current work on the effects of motivation on reinforcement learning, since 'energizing', one of the two main motivational effects, is concerned precisely with modulations in the vigour of prepotent responses, and thus with changes in rates. Furthermore, dopamine, the golden-haired neurochemical of reinforcement learning, has been directly implicated in these energizing motivational effects[2]. Here, we use data collected from experiments on both energizing and motivation's other main effect ('directing' behavior by altering the structure of subjects' goals) to investigate the influence of motivation on free-operant instrumental behavior in detail, in order to propose a reinforcement learning model of rates of responding.

[1] P. Dayan, et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning, 1996, The Journal of Neuroscience.

[2] Tatsuo K. Sato, et al. Correlated Coding of Motivation and Outcome of Decision by Dopamine Neurons, 2003, The Journal of Neuroscience.