A framework for mesencephalic dopamine systems based on predictive Hebbian learning

We develop a theoretical framework that shows how mesencephalic dopamine systems could distribute to their targets a signal that represents information about future expectations. In particular, we show how activity in the cerebral cortex can make predictions about future receipt of reward and how fluctuations in the activity levels of neurons in diffuse dopamine systems above and below baseline levels would represent errors in these predictions that are delivered to cortical and subcortical targets. We present a model for how such errors could be constructed in a real brain that is consistent with physiological results for a subset of dopaminergic neurons located in the ventral tegmental area and surrounding dopaminergic nuclei. The theory also makes testable predictions about human choice behavior on a simple decision-making task. Furthermore, we show that, through a simple influence on synaptic plasticity, fluctuations in dopamine release can act to change the predictions in an appropriate manner.
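The idea of a prediction error that rises above or falls below baseline can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a standard temporal-difference formulation, delta(t+1) = r(t+1) + gamma * V(t+1) - V(t), with the cue represented as a tapped delay line (one weight per time step since cue onset). The function name `simulate_td`, its parameters, and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_td(n_trials=200, n_steps=20, cue_t=5, reward_t=15,
                lr=0.3, gamma=1.0, omit_last=True):
    """Toy temporal-difference model of a cue followed by a reward.

    Assumption (not from the source): the cue is represented by a tapped
    delay line, so w[k] is the reward prediction k steps after cue onset.
    Returns the learned weights and the prediction error per trial and step.
    """
    w = np.zeros(n_steps)                    # adaptive "synaptic" weights
    deltas = np.zeros((n_trials, n_steps))   # prediction error (0 = baseline)

    for trial in range(n_trials):
        # Optionally withhold the reward on the final trial as a probe.
        rewarded = not (omit_last and trial == n_trials - 1)
        for t in range(n_steps - 1):
            # Prediction is zero before the cue has appeared.
            v_now = w[t - cue_t] if t >= cue_t else 0.0
            v_next = w[t + 1 - cue_t] if t + 1 >= cue_t else 0.0
            r = 1.0 if (rewarded and t + 1 == reward_t) else 0.0

            # Error in the prediction, registered at time t + 1.
            delta = r + gamma * v_next - v_now
            deltas[trial, t + 1] = delta

            # Plasticity gated by the error: the weight that was active
            # at time t changes in proportion to delta.
            if t >= cue_t:
                w[t - cue_t] += lr * delta
    return w, deltas

w, deltas = simulate_td()
print("first trial, error at reward time:  ", deltas[0, 15])
print("trained trial, error at reward time:", round(deltas[-2, 15], 3))
print("trained trial, error at cue onset:  ", round(deltas[-2, 5], 3))
print("omission trial, error at reward time:", round(deltas[-1, 15], 3))
```

Under these assumptions the error appears at the time of reward early in training, moves to cue onset once the reward is predicted, and dips below baseline when a predicted reward is withheld, which is the qualitative pattern the abstract describes for fluctuations above and below baseline.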