Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

In this review, we compare methods for temporal sequence learning (TSL) across four areas: machine control, classical conditioning, neuronal models of TSL, and spike-timing-dependent plasticity (STDP). We introduce the most influential models and focus on two questions: To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? And how do the different models correspond to possible underlying biological mechanisms of synaptic plasticity? We first compare the models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here we discuss the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment, showing that the two learning approaches are fundamentally different in the closed-loop condition. To address the second question, we compare neuronal versions of the different learning architectures to the anatomy of the brain structures involved (basal ganglia, thalamus, and cortex) and to the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found despite the very different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.
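
As a rough illustration of the open-loop comparison described above, the following minimal sketch trains one weight on a conditioned-stimulus (CS) trace with two rules: a TD(0) update driven by a reward pulse, and a differential Hebbian update driven by the derivative of the output. It is not taken from the review. The exponential trace, the function names (`trace`, `td_weight`, `hebb_weight`), and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def trace(onset, n_steps, decay=0.8):
    """Exponentially decaying trace of a unit pulse at `onset` (assumed stimulus model)."""
    u = np.zeros(n_steps)
    u[onset:] = decay ** np.arange(n_steps - onset)
    return u

def td_weight(cs_onset, us_onset, n_steps=40, n_trials=100, lr=0.1, gamma=0.95):
    """Reward-based rule: TD(0) with linear value V(t) = w * u_cs(t)."""
    u = trace(cs_onset, n_steps)
    r = np.zeros(n_steps)
    r[us_onset] = 1.0                      # reward (US) delivered as a single pulse
    w = 0.0
    for _ in range(n_trials):
        for t in range(n_steps - 1):
            # TD error: delta = r(t+1) + gamma * V(t+1) - V(t)
            delta = r[t + 1] + gamma * w * u[t + 1] - w * u[t]
            w += lr * delta * u[t]
    return w

def hebb_weight(cs_onset, us_onset, n_steps=40, n_trials=100, lr=0.01):
    """Correlation-based rule: dw = lr * u_cs(t) * (v(t) - v(t-1))."""
    u_cs = trace(cs_onset, n_steps)
    u_us = trace(us_onset, n_steps)
    w = 0.0
    for _ in range(n_trials):
        for t in range(1, n_steps):
            # Output v = US path (fixed weight 1) + plastic CS path
            v_now = u_us[t] + w * u_cs[t]
            v_prev = u_us[t - 1] + w * u_cs[t - 1]
            w += lr * u_cs[t] * (v_now - v_prev)
    return w

if __name__ == "__main__":
    # CS at step 10, US at step 15: the CS precedes the US
    print("TD weight:  ", td_weight(10, 15))
    print("Hebb weight:", hebb_weight(10, 15))
```

In this toy setting both rules strengthen the CS weight when the CS precedes the US. With the order reversed, the TD sketch leaves the weight at zero while the differential Hebbian sketch drives it negative; this is a property of this particular setup and should not be read as a general result.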


[234]  K. Holthoff,et al.  A problem with Hebb and local spikes , 2002, Trends in Neurosciences.

[235]  P. Montague,et al.  Activity in human ventral striatum locked to errors of reward prediction , 2002, Nature Neuroscience.

[236]  J. Leo van Hemmen,et al.  Mapping time , 2002, Biological Cybernetics.

[237]  M. R. Mehta,et al.  Role of experience and oscillations in transforming a rate code into a temporal code , 2002, Nature.

[238]  John N. J. Reynolds,et al.  Dopamine-dependent plasticity of corticostriatal synapses , 2002, Neural Networks.

[239]  P. Dayan Matters temporal , 2002, Trends in Cognitive Sciences.

[240]  Y. Dan,et al.  Spike-timing-dependent synaptic modification induced by natural spike trains , 2002, Nature.

[241]  U. Karmarkar,et al.  A model of spike-timing dependent plasticity: one or two coincidence detectors? , 2002, Journal of neurophysiology.

[242]  H. Abarbanel,et al.  Dynamical model of long-term synaptic plasticity , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[243]  Stefano Fusi,et al.  Hebbian spike-driven synaptic plasticity for learning patterns of mean firing rates , 2002, Biological Cybernetics.

[244]  Stefan Wermter,et al.  Spike-Timing Dependent Competitive Learning of Integrate-and-Fire Neurons with Active Dendrites , 2002, ICANN.

[245]  R. Palmiter,et al.  Reward without Dopamine , 2003, The Journal of Neuroscience.

[246]  Florentin Wörgötter,et al.  Isotropic-sequence-order learning in a closed-loop behavioural system , 2003, Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences.

[247]  Florentin Wörgötter,et al.  Isotropic Sequence Order Learning , 2003, Neural Computation.

[248]  Katsunori Kitano,et al.  Time representing cortical activities: two models inspired by prefrontal persistent activity , 2003, Biological Cybernetics.

[249]  Florentin Wörgötter,et al.  ISO Learning Approximates a Solution to the Inverse-Controller Problem in an Unsupervised Behavioral Paradigm , 2003, Neural Computation.

[250]  J. Houk,et al.  Modulation of striatal single units by expected reward: a spiny neuron model displaying dopamine-induced bistability. , 2003, Journal of neurophysiology.

[251]  Karl J. Friston,et al.  Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[252]  Ramón Huerta,et al.  Biophysical model of synaptic plasticity dynamics , 2003, Biological Cybernetics.

[253]  Eugene M. Izhikevich,et al.  Relating STDP to BCM , 2003, Neural Computation.

[254]  Naoyuki Sato,et al.  Memory Encoding by Theta Phase Precession in the Hippocampal Network , 2003, Neural Computation.

[255]  N. Daw,et al.  Reinforcement learning models of the dopamine system and their behavioral implications , 2003 .

[256]  Haim Sompolinsky,et al.  Learning Input Correlations through Nonlinear Temporally Asymmetric Hebbian Plasticity , 2003, The Journal of Neuroscience.

[257]  G. Pagnoni,et al.  Human Striatal Response to Salient Nonrewarding Stimuli , 2003, The Journal of Neuroscience.

[258]  Peter Dayan,et al.  Q-learning , 1992, Machine Learning.

[259]  Michele Migliore,et al.  Role of an A-Type K+ Conductance in the Back-Propagation of Action Potentials in the Dendrites of Hippocampal Pyramidal Neurons , 1999, Journal of Computational Neuroscience.

[260]  Peter Dayan,et al.  The convergence of TD(λ) for general λ , 1992, Machine Learning.

[261]  J. Bargas,et al.  Inhibitory action of dopamine involves a subthreshold Cs+-sensitive conductance in neostriatal neurons , 1996, Experimental Brain Research.

[262]  H. Markram,et al.  Coding and learning of behavioral sequences , 2004, Trends in Neurosciences.

[263]  Florentin Wörgötter,et al.  How the Shape of Pre- and Postsynaptic Signals Can Influence STDP: A Biophysical Model , 2004, Neural Computation.

[264]  Richard S. Sutton,et al.  Reinforcement learning with replacing eligibility traces , 1996, Machine Learning.

[265]  José Luis Contreras-Vidal,et al.  A Predictive Reinforcement Model of Dopamine Neurons for Learning Approach Behavior , 1999, Journal of Computational Neuroscience.

[266]  Patrick D. Roberts,et al.  Computational Consequences of Temporally Asymmetric Learning Rules: I. Differential Hebbian Learning , 1999, Journal of Computational Neuroscience.

[267]  Peter Dayan,et al.  Technical Note: Q-Learning , 1992, Machine Learning.

[268]  M. Mehta Cooperative LTP can map memory sequences on dendritic branches , 2004, Trends in Neurosciences.

[269]  Walter Senn,et al.  Spike-Based Synaptic Plasticity and the Emergence of Direction Selective Simple Cells: Simulation Results , 2002, Journal of Computational Neuroscience.

[270]  P. Dayan,et al.  TD(λ) converges with probability 1 , 1994, Machine Learning.

[271]  Patrick D. Roberts,et al.  Computational Consequences of Temporally Asymmetric Learning Rules: II. Sensory Image Cancellation , 2000, Journal of Computational Neuroscience.

[272]  S. Wise,et al.  Premotor and supplementary motor cortex in rhesus monkeys: neuronal activity during externally- and internally-instructed motor tasks , 2004, Experimental Brain Research.

[273]  V. Russell,et al.  Regional distribution of monoamines and dopamine D1- and D2-receptors in the striatum of the rat , 1992, Neurochemical Research.

[274]  Jürgen Schmidhuber,et al.  Fast Online Q(λ) , 1998, Machine Learning.

[275]  Walter Senn,et al.  Spike-Based Synaptic Plasticity and the Emergence of Direction Selective Simple Cells: Mathematical Analysis , 2003, Journal of Computational Neuroscience.

[276]  Sean R Eddy,et al.  What is dynamic programming? , 2004, Nature Biotechnology.

[277]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 2005, IEEE Transactions on Neural Networks.

[278]  T. Bliss Long-lasting potentiation of synaptic transmission , 2005 .

[279]  Richard S. Sutton,et al.  Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[280]  Mark D. Humphries,et al.  A robot model of the basal ganglia: Behavior and intrinsic processing , 2006, Neural Networks.

[281]  K. Weigmann Robots emulating children , 2006 .

[282]  Thomas P. Trappenberg,et al.  Rapid learning and robust recall of long sequences in modular associator networks , 2006, Neurocomputing.

[283]  Ramón Huerta,et al.  Generation and reshaping of sequences in neural systems , 2006, Biological Cybernetics.

[284]  Florentin Wörgötter,et al.  Strongly Improved Stability and Faster Convergence of Temporal Sequence Learning by Using Input Correlations Only , 2006, Neural Computation.

[285]  Norbert Krüger,et al.  Symbols as Self-emergent Entities in an Optimization Process of Feature Extraction and Predictions , 2006, Biological Cybernetics.

[286]  Friedemann Pulvermüller,et al.  Language models based on Hebbian cell assemblies , 2006, Journal of Physiology-Paris.

[287]  H. Sompolinsky,et al.  The tempotron: a neuron that learns spike timing–based decisions , 2006, Nature Neuroscience.

[288]  Fred Cummins,et al.  Modeling dopamine activity by Reinforcement Learning methods: implications from two recent models , 2006, Artificial Intelligence Review.