Temporal Difference Model Reproduces Anticipatory Neural Activity

Anticipatory neural activity preceding behaviorally important events has been reported in cortex, striatum, and midbrain dopamine neurons. Whereas dopamine neurons are phasically activated by reward-predictive stimuli, anticipatory activity of cortical and striatal neurons increases during delay periods before important events. Characteristics of dopamine neuron activity resemble those of the prediction error signal of the temporal difference (TD) model of Pavlovian learning (Sutton & Barto, 1990). This study demonstrates that the prediction signal of the TD model reproduces characteristics of cortical and striatal anticipatory neural activity. This finding suggests that tonic anticipatory activity may reflect prediction signals that are involved in the processing of dopamine neuron activity.
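The distinction between the two TD signals can be illustrated with a minimal sketch. The abstract does not specify the stimulus representation or parameters used in the study, so everything below is an assumption for illustration: a generic TD(0) learner with a tapped-delay-line (one feature per time step after stimulus onset) representation, a hypothetical function name `simulate_td`, and arbitrary values for the trial length, discount factor `gamma`, and learning rate `alpha`. After training, the prediction signal rises through the delay between stimulus and reward, while the prediction error is phasic at stimulus onset and close to zero at the predicted reward.

```python
import numpy as np

def simulate_td(n_trials=500, n_steps=20, cs_step=5, reward_step=15,
                gamma=0.9, alpha=0.1):
    """TD(0) on repeated Pavlovian trials (illustrative parameters, not the study's).

    Returns the prediction signal and the prediction error signal,
    each as an array over time steps, recorded on the final trial.
    """
    # Assumed representation: one feature per step from stimulus onset up to reward.
    n_features = reward_step - cs_step
    w = np.zeros(n_features)

    def features(t):
        x = np.zeros(n_features)
        if cs_step <= t < reward_step:
            x[t - cs_step] = 1.0
        return x

    values = np.zeros(n_steps)
    deltas = np.zeros(n_steps)
    for _ in range(n_trials):
        for t in range(n_steps - 1):
            x_t, x_next = features(t), features(t + 1)
            r_next = 1.0 if t + 1 == reward_step else 0.0
            v_t, v_next = w @ x_t, w @ x_next
            # TD prediction error: reward plus discounted next prediction
            # minus current prediction.
            delta = r_next + gamma * v_next - v_t
            w += alpha * delta * x_t
            values[t] = v_t          # prediction signal at step t
            deltas[t + 1] = delta    # error registered when step t + 1 arrives
    return values, deltas

if __name__ == "__main__":
    values, deltas = simulate_td()
    print("prediction during delay:", np.round(values[5:15], 3))
    print("error at stimulus onset:", round(float(deltas[5]), 3))
    print("error at reward:", round(float(deltas[15]), 3))
```

In this sketch the sustained, ramping `values` trace plays the role of the tonic anticipatory activity, and `deltas` plays the role of the phasic dopamine-like signal. The ramp arises only because `gamma` is below 1; with `gamma = 1` the learned prediction would be flat across the delay.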

[1]  L. S. Kogan  Review of Principles of Behavior , 1943 .

[2]  B. Skinner,et al.  Principles of Behavior , 1944 .

[3]  E. Fischer Conditioned Reflexes , 1942, American journal of physical medicine.

[4]  R. Rescorla,et al.  A theory of Pavlovian conditioning : Variations in the effectiveness of reinforcement and nonreinforcement , 1972 .

[5]  N. Mackintosh The psychology of animal learning , 1974 .

[6]  W. K. Honig,et al.  Cognitive Processes in Animal Behavior , 1979 .

[7]  S. Lea,et al.  Contemporary Animal Learning Theory, Anthony Dickinson. Cambridge University Press, Cambridge (1981), xii, +177 pp. £12.50 hardback, £3.95 paperback , 1981 .

[8]  T. Teyler,et al.  Long-term potentiation. , 1987, Annual review of neuroscience.

[9]  Stephen Grossberg,et al.  Neural dynamics of adaptive timing and temporal discrimination during associative learning , 1989, Neural Networks.

[10]  O. Hikosaka,et al.  Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. , 1989, Journal of neurophysiology.

[11]  C. Bruce,et al.  Primate frontal eye fields. III. Maintenance of a spatially accurate saccade signal. , 1990, Journal of neurophysiology.

[12]  G. E. Alexander,et al.  Preparation for movement: neural representations of intended direction in three motor areas of the monkey. , 1990, Journal of neurophysiology.

[13]  G. E. Alexander,et al.  Neural representations of the target (goal) of visually guided arm movements in three motor areas of the monkey. , 1990, Journal of neurophysiology.

[14]  Richard S. Sutton,et al.  Time-Derivative Models of Pavlovian Reinforcement , 1990 .

[15]  M. Gabriel,et al.  Learning and Computational Neuroscience: Foundations of Adaptive Networks , 1990 .

[16]  永福 智志 The Organization of Learning , 2005, Journal of Cognitive Neuroscience.

[17]  P. Calabresi,et al.  Long‐term Potentiation in the Striatum is Unmasked by Removing the Voltage‐dependent Magnesium Block of NMDA Receptor Channels , 1992, The European journal of neuroscience.

[18]  W. Schultz,et al.  Responses of monkey dopamine neurons during learning of behavioral reactions. , 1992, Journal of neurophysiology.

[19]  J R Duhamel,et al.  The updating of the representation of visual space in parietal cortex by intended eye movements. , 1992, Science.

[20]  W. Schultz,et al.  Neuronal activity in monkey ventral striatum related to the expectation of reward , 1992, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[21]  W. Schultz,et al.  Neuronal activity in monkey striatum related to the expectation of predictable environmental events. , 1992, Journal of neurophysiology.

[22]  W. Schultz,et al.  Reward-related activity in the monkey striatum and substantia nigra. , 1993, Progress in brain research.

[23]  W. Schultz,et al.  Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task , 1993, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[24]  Peter Dayan,et al.  Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.

[25]  Joel L. Davis,et al.  A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .

[26]  W. Schultz,et al.  Importance of unpredictability for reward responses in primate dopamine neurons. , 1994, Journal of neurophysiology.

[27]  J. Joseph,et al.  Activity in the caudate nucleus of monkey during spatial sequencing. , 1995, Journal of neurophysiology.

[28]  O. Hikosaka Models of information processing in the basal Ganglia edited by James C. Houk, Joel L. Davis and David G. Beiser, The MIT Press, 1995. $60.00 (400 pp) ISBN 0 262 08234 9 , 1995, Trends in Neurosciences.

[29]  Peter Ford Dominey,et al.  A Model of Corticostriatal Plasticity for Learning Oculomotor Associations and Sequences , 1995, Journal of Cognitive Neuroscience.

[30]  A. Barto,et al.  Adaptive Critics and the Basal Ganglia , 1994 .

[31]  M. Goldberg,et al.  Neurons in the monkey superior colliculus predict the visual result of impending saccadic eye movements. , 1995, Journal of neurophysiology.

[32]  J. Wickens,et al.  Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro , 1996, Neuroscience.

[33]  P. Dayan,et al.  A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[34]  Masataka Watanabe  Reward expectancy in primate prefrontal neurons , 1996, Nature.

[35]  P. Calabresi,et al.  Abnormal Synaptic Plasticity in the Striatum of Mice Lacking Dopamine D2 Receptors , 1997, The Journal of Neuroscience.

[36]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[37]  H. T. Blair,et al.  Anticipatory time intervals of head-direction cells in the anterior thalamus of the rat: implications for path integration in the head-direction circuit. , 1997, Journal of neurophysiology.

[38]  Rajesh P. N. Rao,et al.  Dynamic Model of Visual Recognition Predicts Neural Response Properties in the Visual Cortex , 1997, Neural Computation.

[39]  M. Goldberg,et al.  Spatial processing in the monkey frontal eye field. I. Predictive visual responses. , 1997, Journal of neurophysiology.

[40]  W. Schultz,et al.  Learning of sequential movements by neural network model with dopamine-like reinforcement signal , 1998, Experimental Brain Research.

[41]  J. Hollerman,et al.  Influence of reward expectation on behavior-related neuronal activity in primate striatum. , 1998, Journal of neurophysiology.

[42]  J. Hollerman,et al.  Dopamine neurons report an error in the temporal prediction of reward during learning , 1998, Nature Neuroscience.

[43]  M. Hasselmo,et al.  The hippocampus as an associator of discontiguous events , 1998, Trends in Neurosciences.

[44]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[45]  J. Hollerman,et al.  Modifications of reward expectation-related neuronal activity during learning in primate striatum. , 1998, Journal of neurophysiology.

[46]  W. Schultz,et al.  A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task , 1999, Neuroscience.

[47]  Joshua W. Brown,et al.  How the Basal Ganglia Use Parallel Excitatory and Inhibitory Learning Pathways to Selectively Respond to Unexpected Rewarding Cues , 1999, The Journal of Neuroscience.

[48]  Rajesh P. N. Rao,et al.  Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. , 1999 .

[49]  W. Schultz,et al.  Relative reward preference in primate orbitofrontal cortex , 1999, Nature.

[50]  J. Hollerman,et al.  Reward processing in primate orbitofrontal cortex and basal ganglia. , 2000, Cerebral cortex.

[51]  W. Schultz,et al.  Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. , 2000, Journal of neurophysiology.

[52]  M. Arbib,et al.  Modeling functions of striatal dopamine modulation in learning and planning , 2001, Neuroscience.

[53]  Kenji Doya,et al.  Neural mechanisms of learning and control , 2001 .

[54]  Michael E. Hasselmo,et al.  Faculty Opinions recommendation of A framework for mesencephalic dopamine systems based on predictive Hebbian learning. , 2001 .

[55]  Yves Burnod,et al.  An integrative theory of the phasic and tonic modes of dopamine modulation in the prefrontal cortex , 2002, Neural Networks.

[56]  Jonghan Shin,et al.  A unifying theory on the relationship between spike trains, EEG, and ERP based on the noise shaping/predictive neural coding hypothesis. , 2002, Bio Systems.

[57]  P. Dayan,et al.  Reward, Motivation, and Reinforcement Learning , 2002, Neuron.

[58]  R. Peterson "Buy on the Rumor:" Anticipatory Affect and Investor Behavior , 2002 .

[59]  Kae Nakamura,et al.  Updating of the visual representation in monkey striate and extrastriate cortex during saccades , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[60]  Y. Burnod,et al.  A Model of Prefrontal Cortex Dopaminergic Modulation during the Delayed Alternation Task , 2002, Journal of Cognitive Neuroscience.

[61]  Olaf Sporns,et al.  Neuromodulation and plasticity in an autonomous robot , 2002, Neural Networks.

[62]  Roland E. Suri,et al.  TD models of reward predictive responses in dopamine neurons , 2002, Neural Networks.