Paper Information - Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

In this review, we compare methods for temporal sequence learning (TSL) across the disciplines of machine control, classical conditioning, neuronal models for TSL, and spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: (1) To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? (2) How do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here we discuss the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment, showing that the two learning approaches are fundamentally different in the closed-loop condition. To address the second question, we compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal ganglia, thalamus, and cortex) and to the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found in spite of the strongly different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.

Florentin Wörgötter | Bernd Porr

[1] C. L. Hull. The problem of stimulus equivalence in behavior theory. , 1939 .

[2] D. Whitteridge. Lectures on Conditioned Reflexes , 1942, Nature.

[3] B. Skinner,et al. Principles of Behavior , 1944 .

[4] C. L. Hull. Principles of Behavior , 1945 .

[5] F. Attneave,et al. The Organization of Behavior: A Neuropsychological Theory , 1949 .

[6] Kenneth L. Artis. Design for a Brain , 1961 .

[7] W. F. Prokasy,et al. Adaptation, Sensitization, Forward and Backward Conditioning, and Pseudoconditioning of the GSR , 1962 .

[8] T. Bliss,et al. Plasticity in a monosynaptic cortical pathway. , 1970, The Journal of physiology.

[9] A. H. Klopf,et al. Brain Function and Adaptive Systems: A Heterostatic Theory , 1972 .

[10] R. Rescorla. A theory of pavlovian conditioning: The effectiveness of reinforcement and non-reinforcement , 1972 .

[11] T. Bliss,et al. Long‐lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path , 1973, The Journal of physiology.

[12] N. Mackintosh. The psychology of animal learning , 1974 .

[13] Ian H. Witten,et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments , 1977, Inf. Control..

[14] J. C. Stoof,et al. Opposing roles for D-1 and D-2 dopamine receptors in efflux of cyclic AMP from rat neostriatum , 1981, Nature.

[15] J. D. Miller,et al. Mesencephalic dopaminergic unit activity in the behaviorally conditioned rat. , 1981, Life sciences.

[16] A G Barto,et al. Toward a modern theory of adaptive networks: expectation and prediction. , 1981, Psychological review.

[17] C. Y. Yim,et al. Response of nucleus accumbens neurons to amygdala stimulation and its modification by dopamine , 1982, Brain Research.

[18] E. Bienenstock,et al. Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex , 1982, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[19] M. Memo,et al. Agonist-induced subsensitivity of adenylate cyclase coupled with a dopamine receptor in slices from rat corpus striatum. , 1982, Proceedings of the National Academy of Sciences of the United States of America.

[20] N. Mackintosh,et al. Conditioning And Associative Learning , 1983 .

[21] W. Levy,et al. Temporal contiguity requirements for long-term associative potentiation/depression in the hippocampus , 1983, Neuroscience.

[22] John S. Edwards,et al. The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .

[23] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[24] J. Brown,et al. The electrophysiology of dopamine (D2) receptors: A study of the actions of dopamine on corticostriatal transmission , 1983, Neuroscience.

[25] P. Greengard,et al. DARPP-32, a dopamine-regulated neuronal phosphoprotein, is a potent inhibitor of protein phosphatase-1 , 1984, Nature.

[26] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .

[27] L. Nowak,et al. Magnesium gates glutamate-activated channels in mouse central neurones , 1984, Nature.

[28] P. Greengard,et al. Mammalian brain phosphoproteins as substrates for calcineurin. , 1984, The Journal of biological chemistry.

[29] C. Gerfen. The neostriatal mosaic: compartmentalization of corticostriatal input and striatonigral output systems , 1984, Nature.

[30] M. Mayer,et al. Voltage-dependent block by Mg2+ of NMDA responses in spinal cord neurones , 1984, Nature.

[31] W. Schultz,et al. Responses of rat pallidum cells to cortex stimulation and effects of altered dopaminergic activity , 1985, Neuroscience.

[32] C. Gerfen. The neostriatal mosaic. I. compartmental organization of projections from the striatum to the substantia nigra in the rat , 1985, The Journal of comparative neurology.

[33] A. Harry Klopf,et al. A drive-reinforcement model of single neuron function , 1987 .

[34] W. Schultz. Responses of midbrain dopamine neurons to behavioral trigger stimuli in the monkey. , 1986, Journal of neurophysiology.

[35] B. Kosco. Differential Hebbian learning , 1987 .

[36] P. Calabresi,et al. Intracellular studies on the dopamine-induced firing inhibition of neostriatal neurons in vitro: Evidence for D1 receptor involvement , 1987, Neuroscience.

[37] B. Gustafsson,et al. Long-term potentiation in the hippocampus using depolarizing current pulses as the conditioning stimulus to single volley synaptic potentials , 1987, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[38] J. Joyce,et al. Quantitative autoradiography of dopamine D2 sites in rat caudate-putamen: Localization to intrinsic neurons and not to neocortical afferents , 1987, Neuroscience.

[39] C. Gerfen,et al. The neostriatal mosaic: II. Patch- and matrix-directed mesostriatal dopaminergic and non-dopaminergic systems , 1987, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[40] Bernard Widrow,et al. Adaptive switching circuits , 1988 .

[41] A. Klopf. A neuronal model of classical conditioning , 1988 .

[42] B. Berger,et al. Regional and laminar distribution of the dopamine and serotonin innervation in the macaque cerebral cortex: A radioautographic study , 1988, The Journal of comparative neurology.

[43] Stephen Grossberg,et al. Art 2: Self-Organization Of Stable Category Recognition Codes For Analog Input Patterns , 1988, Other Conferences.

[44] Stephen Grossberg,et al. Neural dynamics of adaptive timing and temporal discrimination during associative learning , 1989, Neural Networks.

[45] C. Watkins. Learning from delayed rewards , 1989 .

[46] D. McFarland. Problems of animal behaviour , 1989 .

[47] R. Tsien,et al. Inhibition of postsynaptic PKC or CaMKII blocks induction but not expression of LTP. , 1989, Science.

[48] J. Lisman,et al. A mechanism for the Hebb and the anti-Hebb processes underlying learning and memory. , 1989, Proceedings of the National Academy of Sciences of the United States of America.

[49] R. Nicoll,et al. An essential role for postsynaptic calmodulin and protein kinase activity in long-term potentiation , 1989, Nature.

[50] C. Altar,et al. Discriminatory roles for D1 and D2 dopamine receptor subtypes in the in vivo control of neostriatal cyclic GMP. , 1990, European journal of pharmacology.

[51] A. Parent. Extrinsic connections of the basal ganglia , 1990, Trends in Neurosciences.

[52] Richard S. Sutton,et al. Time-Derivative Models of Pavlovian Reinforcement , 1990 .

[53] M. Gabriel,et al. Learning and Computational Neuroscience: Foundations of Adaptive Networks , 1990 .

[54] H. Groenewegen,et al. The anatomical relationship of the prefrontal cortex with the striatopallidal system, the thalamus and the amygdala: evidence for a parallel organization. , 1990, Progress in brain research.

[55] Daniel C. Dennett,et al. Cognitive Wheels: The Frame Problem of AI , 1990, The Philosophy of Artificial Intelligence.

[56] A. Grace,et al. Midbrain dopamine system electrophysiological functioning: A review and new hypothesis , 1991, Synapse.

[57] P. Calabresi,et al. Long‐term Potentiation in the Striatum is Unmasked by Removing the Voltage‐dependent Magnesium Block of NMDA Receptor Channels , 1992, The European journal of neuroscience.

[58] W. Schultz,et al. Responses of monkey dopamine neurons during learning of behavioral reactions. , 1992, Journal of neurophysiology.

[59] P. Calabresi,et al. Coactivation of D1 and D2 dopamine receptors is required for long-term synaptic depression in the striatum , 1992, Neuroscience Letters.

[60] G. Tesauro. Practical Issues in Temporal Difference Learning , 1992 .

[61] C. Gerfen. The neostriatal mosaic: multiple levels of compartmental organization in the basal ganglia. , 1992, Annual review of neuroscience.

[62] P. Calabresi,et al. Long-term synaptic depression in the striatum: physiological and pharmacological characterization , 1992, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[63] A. Konnerth,et al. Sodium action potentials in the dendrites of cerebellar Purkinje cells. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[64] W. Schultz,et al. Neuronal activity in monkey ventral striatum related to the expectation of reward , 1992, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[65] W. N. Ross,et al. Imaging voltage and synaptically activated sodium transients in cerebellar Purkinje cells , 1992, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[66] D. Sibley,et al. Molecular biology of dopamine receptors. , 1992, Trends in pharmacological sciences.

[67] D. Lovinger,et al. Short- and long-term synaptic depression in rat neostriatum. , 1993, Journal of neurophysiology.

[68] W. Singer,et al. Long-term depression of excitatory synaptic transmission and its relationship to long-term potentiation , 1993, Trends in Neurosciences.

[69] D. Surmeier,et al. D1 and D2 dopamine receptor modulation of sodium and potassium currents in rat neostriatal neurons. , 1993, Progress in brain research.

[70] P. Goldman-Rakic,et al. Characterization of the dopaminergic innervation of the primate frontal cortex using a dopamine-specific antibody. , 1993, Cerebral cortex.

[71] F. H. Lopes da Silva,et al. Synaptic Plasticity in an In Vitro Slice Preparation of the Rat Nucleus Accumbens , 1993, The European journal of neuroscience.

[72] J. Walsh,et al. Synaptic activation of N-methyl-d-aspartate receptors induces short-term potentiation at excitatory synapses in the striatum of the rat , 1993, Neuroscience.

[73] J. Walsh. Depression of excitatory synaptic input in rat striatal neurons , 1993, Brain Research.

[74] C. Anderson,et al. Multigrid Q-learning , 1994 .

[75] Joel L. Davis,et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .

[76] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .

[77] S. Haber,et al. Primate striatonigral projections: A comparison of the sensorimotor‐related striatum and the ventral striatum , 1994, The Journal of comparative neurology.

[78] W. Schultz,et al. Importance of unpredictability for reward responses in primate dopamine neurons. , 1994, Journal of neurophysiology.

[79] James C. Houk,et al. Elements of the Intrinsic Organization and Information Processing in the Neostriatum , 1994 .

[80] D. Debanne,et al. Asynchronous pre- and postsynaptic activity induces associative long-term depression in area CA1 of the rat hippocampus in vitro. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[81] B. Sakmann,et al. Active propagation of somatic action potentials into neocortical pyramidal cell dendrites , 1994, Nature.

[82] M. L. Pucak,et al. Regulation of substantia nigra dopamine neurons. , 1994, Critical reviews in neurobiology.

[83] R. Malenka,et al. Involvement of a calcineurin/inhibitor-1 phosphatase cascade in hippocampal long-term depression , 1994, Nature.

[84] P. Greengard,et al. Modulation of calcium currents by a D1 dopaminergic protein kinase/phosphatase cascade in rat neostriatal neurons , 1995, Neuron.

[85] S Grossberg,et al. A spectral network model of pitch perception. , 1995, The Journal of the Acoustical Society of America.

[86] A. Barto,et al. Adaptive Critics and the Basal Ganglia , 1994 .

[87] Gavin Adrian Rummery. Problem solving with reinforcement learning , 1995 .

[88] Joel L. Davis,et al. Adaptive Critics and the Basal Ganglia , 1995 .

[89] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.

[90] Peter Dayan,et al. Bee foraging in uncertain environments using predictive hebbian learning , 1995, Nature.

[91] P. Calabresi,et al. Transmitter Release Associated with Long‐term Synaptic Depression in Rat Corticostriatal Slices , 1995, The European journal of neuroscience.

[92] G. Buzsáki,et al. Pattern and inhibition-dependent invasion of pyramidal cell dendrites by fast spikes in the hippocampus in vivo. , 1996, Proceedings of the National Academy of Sciences of the United States of America.

[93] R. Huganir,et al. Characterization of Multiple Phosphorylation Sites on the AMPA Receptor GluR1 Subunit , 1996, Neuron.

[94] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[95] J. Wickens,et al. Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex In vitro , 1996, Neuroscience.

[96] Wulfram Gerstner,et al. A neuronal learning rule for sub-millisecond temporal coding , 1996, Nature.

[97] P. Dayan,et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[98] D. Johnston,et al. Active properties of neuronal dendrites. , 1996, Annual review of neuroscience.

[99] P. Calabresi,et al. The corticostriatal projection: from synaptic plasticity to dysfunctions of the basal ganglia , 1996, Trends in Neurosciences.

[100] S. Grossberg,et al. The Hippocampus and Cerebellum in Adaptively Timed Learning, Recognition, and Movement , 1996, Journal of Cognitive Neuroscience.

[101] Jordan B. Pollack,et al. Why did TD-Gammon Work? , 1996, NIPS.

[102] J. L. Martínez,et al. Long-term potentiation and learning. , 1996, Annual review of psychology.

[103] K. I. Blum,et al. Functional significance of long-term potentiation for sequence learning and prediction. , 1996, Cerebral cortex.

[104] V. Han,et al. Synaptic plasticity in a cerebellum-like structure depends on temporal order , 1997, Nature.

[105] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.

[106] J. Bargas,et al. D1 Receptor Activation Enhances Evoked Discharge in Neostriatal Medium Spiny Neurons by Modulating an L-Type Ca2+ Conductance , 1997, The Journal of Neuroscience.

[107] P. Greengard,et al. Bidirectional Regulation of DARPP-32 Phosphorylation by Dopamine , 1997, The Journal of Neuroscience.

[108] D. Johnston,et al. K+ channel regulation of signal propagation in dendrites of hippocampal pyramidal neurons , 1997, Nature.

[109] B. Morris,et al. Dynamic changes in NADPH-diaphorase staining reflect activity of nitric oxide synthase: Evidence for a dopaminergic regulation of striatal nitric oxide release , 1997, Neuropharmacology.

[110] D. Lovinger,et al. Decreased probability of neurotransmitter release underlies striatal long-term depression and postnatal development of corticostriatal synapses. , 1997, Proceedings of the National Academy of Sciences of the United States of America.

[111] S. Hoffman,et al. Funding for malaria genome sequencing , 1997, Nature.

[112] N. Spruston,et al. Action potential initiation and backpropagation in neurons of the mammalian CNS , 1997, Trends in Neurosciences.

[113] P. Overton,et al. Burst firing in midbrain dopaminergic neurons , 1997, Brain Research Reviews.

[114] D. Johnston,et al. A Synaptically Controlled, Associative Signal for Hebbian Plasticity in Hippocampal Neurons , 1997, Science.

[115] H. Sebastian Seung,et al. Learning Continuous Attractors in Recurrent Networks , 1997, NIPS.

[116] R. Huganir,et al. Characterization of Protein Kinase A and Protein Kinase C Phosphorylation of the N-Methyl-D-aspartate Receptor NR1 Subunit Using Phosphorylation Site-specific Antibodies , 1997, The Journal of Biological Chemistry.

[117] M. Umemiya,et al. Dopaminergic modulation of excitatory postsynaptic currents in rat neostriatal neurons. , 1997, Journal of neurophysiology.

[118] D. Johnston,et al. Regulation of Synaptic Efficacy by Coincidence of Postsynaptic APs and EPSPs , 1997 .

[119] B. Sakmann,et al. Calcium action potentials restricted to distal apical dendrites of rat neocortical pyramidal neurons , 1997, The Journal of physiology.

[120] S. Charpier,et al. In vivo activity-dependent plasticity at cortico-striatal connections: evidence for physiological long-term potentiation. , 1997, Proceedings of the National Academy of Sciences of the United States of America.

[121] K. Deisseroth,et al. Translocation of calmodulin to the nucleus supports CREB phosphorylation in hippocampal neurons , 1998, Nature.

[122] W. Schultz,et al. Learning of sequential movements by neural network model with dopamine-like reinforcement signal , 1998, Experimental Brain Research.

[123] P. Greengard,et al. Activation of adenosine A2A and dopamine D1 receptors stimulates cyclic AMP-dependent phosphorylation of DARPP-32 in distinct populations of striatal projection neurons , 1998, Neuroscience.

[124] J. Hollerman,et al. Influence of reward expectation on behavior-related neuronal activity in primate striatum. , 1998, Journal of neurophysiology.

[125] G. Schoenbaum,et al. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning , 1998, Nature Neuroscience.

[126] J. Hollerman,et al. Dopamine neurons report an error in the temporal prediction of reward during learning , 1998, Nature Neuroscience.

[127] B. Sakmann,et al. Calcium dynamics in single spines during coincident pre- and postsynaptic activity depend on relative timing of back-propagating action potentials and subthreshold excitatory postsynaptic potentials. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[128] M. Berridge. Neuronal Calcium Signaling , 1998, Neuron.

[129] Sen Song,et al. Temporally Asymmetric Hebbian Learning, Spike Timing and Neural Response Variability , 1998, NIPS.

[130] T. Sejnowski,et al. A Computational Model of How the Basal Ganglia Produce Sequences , 1998, Journal of Cognitive Neuroscience.

[131] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .

[132] P. De Koninck,et al. Sensitivity of CaM kinase II to the frequency of Ca2+ oscillations. , 1998, Science.

[133] Li I. Zhang,et al. A critical window for cooperation and competition among developing retinotectal synapses , 1998, Nature.

[134] Christian Balkenius,et al. Computational models of classical conditioning: a comparative study , 1998 .

[135] O. Hikosaka,et al. Expectation of reward modulates cognitive signals in the basal ganglia , 1998, Nature Neuroscience.

[136] J. Hollerman,et al. Modifications of reward expectation-related neuronal activity during learning in primate striatum. , 1998, Journal of neurophysiology.

[137] G. Bi,et al. Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength, and Postsynaptic Cell Type , 1998, The Journal of Neuroscience.

[138] D. Debanne,et al. Long‐term synaptic plasticity between pairs of individual CA3 pyramidal cells in rat hippocampal slice cultures , 1998, The Journal of physiology.

[139] K. Berridge,et al. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? , 1998, Brain Research Reviews.

[140] Jack D. Cowan,et al. DYNAMICS OF SELF-ORGANIZED DELAY ADAPTATION , 1999 .

[141] R. Kempter,et al. Hebbian learning and spiking neurons , 1999 .

[142] D. Linden. The Return of the Spike: Postsynaptic Action Potentials and the Induction of LTP and LTD , 1999, Neuron.

[143] Claude Touzet,et al. Dynamic Update of the Reinforcement Function During Learning , 1999, Connect. Sci..

[144] W. Schultz,et al. A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task , 1999, Neuroscience.

[145] C. Frith,et al. Orbitofrontal cortex is activated during breaches of expectation in tasks of visual attention , 1999, Nature Neuroscience.

[146] K. Svoboda,et al. Synaptic [Ca2+]: Intracellular Stores Spill Their Guts , 1999, Neuron.

[147] Joshua W. Brown,et al. How the Basal Ganglia Use Parallel Excitatory and Inhibitory Learning Pathways to Selectively Respond to Unexpected Rewarding Cues , 1999, The Journal of Neuroscience.

[148] P. Redgrave,et al. Is the short-latency dopamine response too short to signal reward error? , 1999, Trends in Neurosciences.

[149] O. Paulsen,et al. Rapid report: postsynaptic bursting is essential for 'Hebbian' induction of associative long-term potentiation at excitatory synapses in rat hippocampus. , 1999, The Journal of physiology.

[150] R. Zucker,et al. Selective induction of LTP and LTD by postsynaptic [Ca2+]i elevation. , 1999, Journal of neurophysiology.

[151] P. Calabresi,et al. Glutamate-Triggered Events Inducing Corticostriatal Long-Term Depression , 1999, The Journal of Neuroscience.

[152] S. Charpier,et al. In vivo induction of striatal long-term potentiation by low-frequency stimulation of the cerebral cortex , 1999, Neuroscience.

[153] K. Mikoshiba,et al. Facilitation of NMDAR-Independent LTP and Spatial Learning in Mutant Mice Lacking Ryanodine Receptor Type 3 , 1999, Neuron.

[154] R. Zucker. Calcium- and activity-dependent synaptic plasticity , 1999, Current Opinion in Neurobiology.

[155] K. Deisseroth,et al. L-type calcium channels and GSK-3 regulate the activity of NF-ATc4 in hippocampal neurons , 1999, Nature.

[156] Claude F. Touzet,et al. Neural Networks and Q-Learning for Robotics , 1999 .

[157] H. Kita,et al. Expression of N-methyl-d-aspartate receptor-dependent long-term potentiation in the neostriatal neurons in an in vitro slice after ethanol withdrawal of the rat , 1999, Neuroscience.

[158] P. Greengard,et al. Beyond the Dopamine Receptor: the DARPP-32/Protein Phosphatase-1 Cascade , 1999 .

[159] P. Greengard,et al. Phosphorylation of DARPP-32 by Cdk5 modulates dopamine signalling in neurons , 1999, Nature.

[160] R. Nicoll,et al. Long-term potentiation--a decade of progress? , 1999, Science.

[161] W. Schultz,et al. Relative reward preference in primate orbitofrontal cortex , 1999, Nature.

[162] Juan Miguel Santos,et al. Exploration tuned reinforcement function , 1999, Neurocomputing.

[163] B. Sakmann,et al. Coincidence detection and changes of synaptic efficacy in spiny stellate neurons in rat barrel cortex , 1999, Nature Neuroscience.

[164] P. Calabresi,et al. Unilateral dopamine denervation blocks corticostriatal LTP. , 1999, Journal of neurophysiology.

[165] Xiaohui Xie,et al. Spike-based Learning Rules and Stabilization of Persistent Neural Activity , 1999, NIPS.

[166] Richard S. Sutton,et al. Open Theoretical Questions in Reinforcement Learning , 1999, EuroCOLT.

[167] M. Bennett,et al. The concept of long term potentiation of transmission at synapses , 2000, Progress in Neurobiology.

[168] Mark C. W. van Rossum,et al. Stable Hebbian Learning from Spike Timing-Dependent Plasticity , 2000, The Journal of Neuroscience.

[169] A. Dickinson,et al. Neuronal coding of prediction errors. , 2000, Annual review of neuroscience.

[170] N. Spruston,et al. Diversity and dynamics of dendritic signaling. , 2000, Science.

[171] V. Han,et al. Reversible Associative Depression and Nonassociative Potentiation at a Parallel Fiber Synapse , 2000, Neuron.

[172] G. Akopian,et al. Functional state of corticostriatal synapses determines their expression of short‐ and long‐term plasticity , 2000, Synapse.

[173] J. Spencer,et al. Bi-directional changes in synaptic plasticity induced at corticostriatal synapses in vitro , 2000, Experimental Brain Research.

[174] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.

[175] L. Abbott,et al. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity , 2000, Nature Neuroscience.

[176] M. Poo,et al. Calcium stores regulate the polarity and input specificity of synaptic modification , 2000, Nature.

[177] S. Kakade,et al. Learning and selective attention , 2000, Nature Neuroscience.

[178] J. Leo van Hemmen,et al. Modeling Synaptic Plasticity in Conjunction with the Timing of Pre- and Postsynaptic Action Potentials , 2000, Neural Computation.

[179] R. Malenka,et al. Dopaminergic modulation of neuronal excitability in the striatum and nucleus accumbens. , 2000, Annual review of neuroscience.

[180] S. J. Martin,et al. Synaptic plasticity and memory: an evaluation of the hypothesis. , 2000, Annual review of neuroscience.

[181] Karl J. Friston,et al. Dissociable Neural Responses in Human Reward Systems , 2000, The Journal of Neuroscience.

[182] D. Feldman,et al. Timing-Based LTP and LTD at Vertical Inputs to Layer II/III Pyramidal Cells in Rat Barrel Cortex , 2000, Neuron.

[183] W. Hauber,et al. NMDA, but Not Dopamine D2, Receptors in the Rat Nucleus Accumbens Are Involved in Guidance of Instrumental Behavior by Stimuli Predicting Reward Magnitude , 2000 .

[184] D. Joel,et al. The connections of the dopaminergic system with the striatum in rats and primates: an analysis with respect to the functional and compartmental organization of the striatum , 2000, Neuroscience.

[185] Nikolaus R. McFarland,et al. Striatonigrostriatal Pathways in Primates Form an Ascending Spiral from the Shell to the Dorsolateral Striatum , 2000, The Journal of Neuroscience.

[186] P. Greengard,et al. Dopamine and cAMP-Regulated Phosphoprotein 32 kDa Controls Both Striatal Long-Term Depression and Long-Term Potentiation, Opposing Forms of Synaptic Plasticity , 2000, The Journal of Neuroscience.

[187] J. Partridge,et al. Regional and postnatal heterogeneity of activity-dependent long-term changes in synaptic efficacy in the dorsal striatum. , 2000, Journal of neurophysiology.

[188] L. Nystrom,et al. Tracking the hemodynamic responses to reward and punishment in the striatum. , 2000, Journal of neurophysiology.

[189] Samuel M. McClure,et al. Predictability Modulates Human Brain Response to Reward , 2001, The Journal of Neuroscience.

[190] Henry Markram,et al. An Algorithm for Modifying Neurotransmitter Release Probability Based on Pre- and Postsynaptic Spike Timing , 2001, Neural Computation.

[191] W. Schultz,et al. Influence of expectation of different rewards on behavior-related neuronal activity in the striatum. , 2001, Journal of neurophysiology.

[192] Y. Dan,et al. Stimulus Timing-Dependent Plasticity in Cortical Processing of Orientation , 2001, Neuron.

[193] P. Calabresi,et al. Dopaminergic control of synaptic plasticity in the dorsal striatum , 2001, The European journal of neuroscience.

[194] Daniel D. Lee,et al. Equilibrium properties of temporally asymmetric Hebbian plasticity. , 2000, Physical review letters.

[195] L. Abbott,et al. Cortical Development and Remapping through Spike Timing-Dependent Plasticity , 2001, Neuron.

[196] R. Kempter,et al. Temporal map formation in the barn owl's brain. , 2001, Physical review letters.

[197] D. Kahneman,et al. Functional Imaging of Neural Responses to Expectancy and Experience of Monetary Gains and Losses , 2001 .

[198] A. West,et al. Calcium regulation of neuronal gene expression , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[199] K. Tang,et al. Dopamine-dependent synaptic plasticity in striatum during in vivo development. , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[200] Nace L. Golding,et al. Compartmental Models Simulating a Dichotomy of Action Potential Backpropagation in Ca1 Pyramidal Neuron Dendrites , 2001, Journal of neurophysiology.

[201] Wulfram Gerstner,et al. Intrinsic Stabilization of Output Rates by Spike-Based Hebbian Learning , 2001, Neural Computation.

[202] Rajesh P. N. Rao,et al. Spike-Timing-Dependent Hebbian Plasticity as Temporal Difference Learning , 2001, Neural Computation.

[203] Isaac Meilijson,et al. Distributed synchrony in a cell assembly of spiking neurons , 2001, Neural Networks.

[204] P. J. Sjöström,et al. Rate, Timing, and Cooperativity Jointly Determine Cortical Synaptic Plasticity , 2001, Neuron.

[205] Brian Knutson,et al. Dissociation of reward anticipation and outcome with event-related fMRI , 2001, Neuroreport.

[206] R. Kempter,et al. Formation of temporal-feature maps by axonal propagation of synaptic learning , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[207] M. Arbib,et al. Modeling functions of striatal dopamine modulation in learning and planning , 2001, Neuroscience.

[208] J. Leo van Hemmen,et al. Temporal receptive fields, spikes, and Hebbian delay selection , 2001, Neural Networks.

[209] Richard Withey. The convergence of convergence , 2001, Aslib Proc..

[210] C. Lüscher,et al. Restless AMPA receptors: implications for synaptic transmission and plasticity , 2001, Trends in Neurosciences.

[211] G. Bi,et al. Synaptic modification by correlated activity: Hebb's postulate revisited. , 2001, Annual review of neuroscience.

[212] R. Dolmetsch,et al. Signaling to the Nucleus by an L-type Calcium Channel-Calmodulin Complex Through the MAP Kinase Pathway , 2001, Science.

[213] L. Cooper,et al. A biophysical model of bidirectional synaptic plasticity: Dependence on AMPA and NMDA receptors , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[214] A. Konnerth,et al. Stores Not Just for Storage: Intracellular Calcium Release and Synaptic Plasticity , 2001, Neuron.

[215] J. Hemmen. Chapter 18 Theory of synaptic plasticity , 2001 .

[216] E. Rolls,et al. Representation of pleasant and aversive taste in the human brain. , 2001, Journal of neurophysiology.

[217] Roland E. Suri,et al. Temporal Difference Model Reproduces Anticipatory Neural Activity , 2001, Neural Computation.

[218] Stuart I. Reynolds. Reinforcement Learning with Exploration , 2002 .

[219] I. Song,et al. Regulation of AMPA receptors during synaptic plasticity , 2002, Trends in Neurosciences.

[220] Guo-Qiang Bi,et al. Spatiotemporal specificity of synaptic plasticity: cellular rules and mechanisms , 2002, Biological Cybernetics.

[221] J. O'Doherty,et al. Neural Responses during Anticipation of a Primary Taste Reward , 2002, Neuron.

[222] David C. Sterratt,et al. Does Morphology Influence Temporal Plasticity? , 2002, ICANN.

[223] W. Schultz. Getting Formal with Dopamine and Reward , 2002, Neuron.

[224] P. Dayan,et al. Reward, Motivation, and Reinforcement Learning , 2002, Neuron.

[225] Dean V. Buonomano,et al. Mechanisms and significance of spike-timing dependent plasticity , 2002, Biological Cybernetics.

[226] B. Porr,et al. Isotropic sequence order learning using a novel linear algorithm in a closed loop behavioural system. , 2002, Bio Systems.

[227] Werner M. Kistler,et al. Spike-timing dependent synaptic plasticity: a phenomenological framework , 2002, Biological Cybernetics.

[228] Nace L. Golding,et al. Dendritic spikes as a mechanism for cooperative long-term potentiation , 2002, Nature.

[229] Patrick D. Roberts,et al. Spike timing dependent synaptic plasticity in biological systems , 2002, Biological Cybernetics.

[230] Eytan Ruppin,et al. Actor-critic models of the basal ganglia: new anatomical and computational perspectives , 2002, Neural Networks.

[231] Katsunori Kitano,et al. An accurate and widely applicable method to determine the distribution of synaptic strengths formed by the spike-timing-dependent learning , 2002, Neurocomputing.

[232] Wulfram Gerstner,et al. Mathematical formulations of Hebbian learning , 2002, Biological Cybernetics.

[233] L. Cooper,et al. A unified model of NMDA receptor-dependent bidirectional synaptic plasticity , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[234] K. Holthoff,et al. A problem with Hebb and local spikes , 2002, Trends in Neurosciences.

[235] P. Montague,et al. Activity in human ventral striatum locked to errors of reward prediction , 2002, Nature Neuroscience.

[236] J. Leo van Hemmen,et al. Mapping time , 2002, Biological Cybernetics.

[237] M. R. Mehta,et al. Role of experience and oscillations in transforming a rate code into a temporal code , 2002, Nature.

[238] John N. J. Reynolds,et al. Dopamine-dependent plasticity of corticostriatal synapses , 2002, Neural Networks.

[239] P. Dayan. Matters temporal , 2002, Trends in Cognitive Sciences.

[240] Y. Dan,et al. Spike-timing-dependent synaptic modification induced by natural spike trains , 2002, Nature.

[241] U. Karmarkar,et al. A model of spike-timing dependent plasticity: one or two coincidence detectors? , 2002, Journal of neurophysiology.

[242] H. Abarbanel,et al. Dynamical model of long-term synaptic plasticity , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[243] Stefano Fusi,et al. Hebbian spike-driven synaptic plasticity for learning patterns of mean firing rates , 2002, Biological Cybernetics.

[244] Stefan Wermter,et al. Spike-Timing Dependent Competitive Learning of Integrate-and-Fire Neurons with Active Dendrites , 2002, ICANN.

[245] R. Palmiter,et al. Reward without Dopamine , 2003, The Journal of Neuroscience.

[246] Florentin Wörgötter,et al. Isotropic-sequence-order learning in a closed-loop behavioural system , 2003, Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences.

[247] Florentin Wörgötter,et al. Isotropic Sequence Order Learning , 2003, Neural Computation.

[248] Katsunori Kitano,et al. Time representing cortical activities: two models inspired by prefrontal persistent activity , 2003, Biological Cybernetics.

[249] Florentin Wörgötter,et al. ISO Learning Approximates a Solution to the Inverse-Controller Problem in an Unsupervised Behavioral Paradigm , 2003, Neural Computation.

[250] J. Houk,et al. Modulation of striatal single units by expected reward: a spiny neuron model displaying dopamine-induced bistability. , 2003, Journal of neurophysiology.

[251] Karl J. Friston,et al. Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[252] Ramón Huerta,et al. Biophysical model of synaptic plasticity dynamics , 2003, Biological Cybernetics.

[253] Eugene M. Izhikevich,et al. Relating STDP to BCM , 2003, Neural Computation.

[254] Naoyuki Sato,et al. Memory Encoding by Theta Phase Precession in the Hippocampal Network , 2003, Neural Computation.

[255] N. Daw,et al. Reinforcement learning models of the dopamine system and their behavioral implications , 2003 .

[256] Haim Sompolinsky,et al. Learning Input Correlations through Nonlinear Temporally Asymmetric Hebbian Plasticity , 2003, The Journal of Neuroscience.

[257] G. Pagnoni,et al. Human Striatal Response to Salient Nonrewarding Stimuli , 2003, The Journal of Neuroscience.

[258] Peter Dayan,et al. Q-learning , 1992, Machine Learning.

[259] Michele Migliore,et al. Role of an A-Type K+ Conductance in the Back-Propagation of Action Potentials in the Dendrites of Hippocampal Pyramidal Neurons , 1999, Journal of Computational Neuroscience.

[260] Peter Dayan,et al. The convergence of TD(λ) for general λ , 1992, Machine Learning.

[261] J. Bargas,et al. Inhibitory action of dopamine involves a subthreshold Cs+-sensitive conductance in neostriatal neurons , 1996, Experimental Brain Research.

[262] H. Markram,et al. Coding and learning of behavioral sequences , 2004, Trends in Neurosciences.

[263] Florentin Wörgötter,et al. How the Shape of Pre- and Postsynaptic Signals Can Influence STDP: A Biophysical Model , 2004, Neural Computation.

[264] Richard S. Sutton,et al. Reinforcement learning with replacing eligibility traces , 2004, Machine Learning.

[265] José Luis Contreras-Vidal,et al. A Predictive Reinforcement Model of Dopamine Neurons for Learning Approach Behavior , 1999, Journal of Computational Neuroscience.

[266] Patrick D. Roberts,et al. Computational Consequences of Temporally Asymmetric Learning Rules: I. Differential Hebbian Learning , 1999, Journal of Computational Neuroscience.

[267] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.

[268] M. Mehta. Cooperative LTP can map memory sequences on dendritic branches , 2004, Trends in Neurosciences.

[269] Walter Senn,et al. Spike-Based Synaptic Plasticity and the Emergence of Direction Selective Simple Cells: Simulation Results , 2002, Journal of Computational Neuroscience.

[270] P. Dayan,et al. TD(λ) converges with probability 1 , 2004, Machine Learning.

[271] Patrick D. Roberts,et al. Computational Consequences of Temporally Asymmetric Learning Rules: II. Sensory Image Cancellation , 2000, Journal of Computational Neuroscience.

[272] S. Wise,et al. Premotor and supplementary motor cortex in rhesus monkeys: neuronal activity during externally- and internally-instructed motor tasks , 2004, Experimental Brain Research.

[273] V. Russell,et al. Regional distribution of monoamines and dopamine D1- and D2-receptors in the striatum of the rat , 1992, Neurochemical Research.

[274] Jürgen Schmidhuber,et al. Fast Online Q(λ) , 1998, Machine Learning.

[275] Walter Senn,et al. Spike-Based Synaptic Plasticity and the Emergence of Direction Selective Simple Cells: Mathematical Analysis , 2003, Journal of Computational Neuroscience.

[276] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.

[277] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 2005, IEEE Transactions on Neural Networks.

[278] T. Bliss. Long-lasting potentiation of synaptic transmission , 2005 .

[279] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[280] Mark D. Humphries,et al. A robot model of the basal ganglia: Behavior and intrinsic processing , 2006, Neural Networks.

[281] K. Weigmann. Robots emulating children , 2006 .

[282] Thomas P. Trappenberg,et al. Rapid learning and robust recall of long sequences in modular associator networks , 2006, Neurocomputing.

[283] Ramón Huerta,et al. Generation and reshaping of sequences in neural systems , 2006, Biological Cybernetics.

[284] Florentin Wörgötter,et al. Strongly Improved Stability and Faster Convergence of Temporal Sequence Learning by Using Input Correlations Only , 2006, Neural Computation.

[285] Norbert Krüger,et al. Symbols as Self-emergent Entities in an Optimization Process of Feature Extraction and Predictions , 2006, Biological Cybernetics.

[286] Friedemann Pulvermüller,et al. Language models based on Hebbian cell assemblies , 2006, Journal of Physiology-Paris.

[287] H. Sompolinsky,et al. The tempotron: a neuron that learns spike timing–based decisions , 2006, Nature Neuroscience.

[288] Fred Cummins,et al. Modeling dopamine activity by Reinforcement Learning methods: implications from two recent models , 2006, Artificial Intelligence Review.