Dopamine responses comply with basic assumptions of formal learning theory

According to contemporary learning theories, the discrepancy, or error, between the actual and predicted reward determines whether learning occurs when a stimulus is paired with a reward. The role of prediction errors is directly demonstrated by the observation that learning is blocked when the stimulus is paired with a fully predicted reward. By using this blocking procedure, we show that the responses of dopamine neurons to conditioned stimuli were governed differentially by the occurrence of reward prediction errors rather than by stimulus–reward associations alone, as was the learning of behavioural reactions. Both behavioural and neuronal learning occurred predominantly when dopamine neurons registered a reward prediction error at the time of the reward. Our data indicate that the use of analytical tests derived from formal behavioural learning theory provides a powerful approach for studying the role of single neurons in learning.
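The logic of the blocking procedure can be illustrated with a minimal simulation. The sketch below assumes a Rescorla-Wagner style update, in which every stimulus present on a trial changes its associative strength in proportion to a shared prediction error. The abstract does not name this model, and the function name, the learning rate `alpha`, the asymptote `lam`, the trial counts, and the stimulus labels "A" and "X" are all illustrative assumptions, not the authors' method or parameters.

```python
def rescorla_wagner(trials, alpha=0.2, lam=1.0):
    """Return associative strengths after a sequence of trials.

    trials: list of (stimuli, rewarded) pairs, where stimuli is a tuple of
            stimulus labels present on that trial and rewarded is a bool.
    alpha:  learning rate (assumed value, for illustration only).
    lam:    maximum associative strength supported by the reward (assumed).
    """
    strength = {}
    for stimuli, rewarded in trials:
        # The reward prediction is the summed strength of all stimuli present.
        prediction = sum(strength.get(s, 0.0) for s in stimuli)
        # Prediction error: actual reward minus predicted reward.
        error = (lam if rewarded else 0.0) - prediction
        # Every stimulus present is updated by the same shared error.
        for s in stimuli:
            strength[s] = strength.get(s, 0.0) + alpha * error
    return strength


# Blocking condition: A alone is paired with reward first, then the
# compound AX is paired with the same, now fully predicted, reward.
pretraining = [(("A",), True)] * 50
compound = [(("A", "X"), True)] * 50
blocked = rescorla_wagner(pretraining + compound)

# Comparison condition: compound training without pretraining of A,
# so the reward is unpredicted at the start of compound training.
control = rescorla_wagner(compound)

print(f"X after blocking: {blocked['X']:.4f}")
print(f"X without pretraining: {control['X']:.4f}")
```

Under these assumptions, X acquires almost no associative strength in the blocking condition, because the reward generates almost no prediction error once A predicts it, whereas X gains substantial strength when the reward is initially unpredicted. X is paired with the reward equally often in both conditions, which is the sense in which the prediction error, rather than the stimulus–reward pairing alone, governs learning.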
