Using Games to Embody Spiking Neural Networks for Neuromorphic Hardware

Adding value to action selection through reinforcement learning provides a mechanism for modifying the future decisions of real and artificial entities. This behavioral-level modulation is vital for performing in complex and dynamic environments. In this paper we focus on four classes of biologically inspired feed-forward spiking neural networks capable of action selection via reinforcement learning, each embodied in a minimal virtual agent. Their ability to learn two simple games through reinforcement and punishment is explored under varying levels of noise and feedback. The networks have no inherent bias toward or understanding of the task; all of the dynamics emerge from environmental interactions. The value of an action takes the form of reinforcement and punishment signals assumed to be provided by the environment or a user. The four classes differ in network complexity, which arises from differences in network architecture, the nature of network interactions (including the interplay between excitation, inhibition and reinforcement), and the degree of bio-fidelity of the model. A novel aspect of these models is that they obey the constraints of neuromorphic hardware currently under development, including the DARPA SyNAPSE neuromorphic chips for very low power spiking model implementations. The simulation results demonstrate the performance of these models on a variant of classic Pong as well as a first-person view selection task. Embodying models in games allows for the creation of environments with varying levels of detail that are ideal for testing spiking neural networks. In addition, the performance results suggest that these models could serve as building blocks for the control of more complex robotic systems.
