Synaptic plasticity model of a spiking neural network for reinforcement learning

This paper presents a reward-related synaptic modification method for a spiking neuron model. The proposed algorithm determines which synapses are eligible for reinforcement by a reward signal: a synapse becomes eligible when a presynaptic spike occurs shortly before a postsynaptic spike. A pre- and postsynaptic spike correlator (PPSC) is defined and used both to determine synaptic eligibility and, in cooperation with the reward signal, to modify synaptic efficacy. A simulation demonstrates how the interaction between the PPSC and the reward signal influences synaptic plasticity.
