NeMo: A Massively Parallel Discrete-Event Simulation Model for Neuromorphic Architectures

Neuromorphic computing is a broad category of non–von Neumann architectures that mimic biological nervous systems in hardware. Current research shows that this class of computing can execute data classification algorithms using only a tiny fraction of the power that conventional CPUs require. This raises a larger research question: how might neuromorphic computing be used to improve the application performance, power consumption, and overall system reliability of future supercomputers? To address this question, an open-source neuromorphic processor architecture simulator called NeMo is being developed. This effort will enable design space exploration of potential heterogeneous compute systems that combine traditional CPUs, GPUs, and neuromorphic hardware. This article examines the design, implementation, and performance of NeMo. It also reports NeMo's efficient execution on 2,048 nodes of an IBM Blue Gene/Q system while modeling 8,388,608 neuromorphic processing cores. At this scale, NeMo's peak performance is just over ten billion events per second.
