Boosting Throughput and Efficiency of Hardware Spiking Neural Accelerators Using Time Compression Supporting Multiple Spike Codes

Spiking neural networks (SNNs) are the third generation of neural networks and can exploit both rate and temporal coding for energy-efficient event-driven computation. However, the decision accuracy of existing SNN designs depends on processing a large number of spikes over a long period, while the switching power of SNN hardware accelerators is proportional to the number of spikes processed and the length of the spike trains limits throughput and static power efficiency. This paper presents the first study on temporal compression that significantly boosts throughput and reduces energy dissipation of digital hardware SNN accelerators while remaining applicable to multiple spike codes. The proposed compression architectures consist of low-cost input spike compression units, novel input-and-output-weighted spiking neurons, and reconfigurable time constant scaling to support large and flexible time compression ratios. They can be applied transparently to any pre-designed SNN employing either rate or temporal codes, with minimal modification of the neural models, learning algorithms, and hardware design. Using spiking speech and image recognition datasets, we demonstrate the feasibility of supporting large time compression ratios of up to 16×, delivering up to 15.93×, 13.88×, and 86.21× improvements in throughput, energy dissipation, and the tradeoff among hardware area, runtime, energy, and classification accuracy, respectively, for different spike codes on a Xilinx Zynq-7000 FPGA. These results are achieved with little extra hardware overhead.
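
To make the idea of time compression concrete, the following is a minimal software sketch, not the hardware design described in the paper. It assumes one plausible reading of the abstract: every `ratio` consecutive time steps of a binary spike train are merged into a single compressed step whose value is the spike count in that window (a weighted spike), and the per-step leak factor of a neuron is rescaled so that one compressed step stands for `ratio` original steps. The function names `compress_spike_train` and `scale_decay` are hypothetical and chosen only for illustration.

```python
from typing import List


def compress_spike_train(spikes: List[int], ratio: int) -> List[int]:
    """Merge every `ratio` consecutive time steps into one compressed step.

    Each output value is the number of spikes in the window, i.e. a
    weighted spike. This is an illustrative assumption, not the paper's
    exact compression unit.
    """
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return [sum(spikes[i:i + ratio]) for i in range(0, len(spikes), ratio)]


def scale_decay(decay_per_step: float, ratio: int) -> float:
    """Leak factor for one compressed step spanning `ratio` original steps.

    Assumes a multiplicative per-step decay, so the compressed decay is
    the original decay applied `ratio` times.
    """
    return decay_per_step ** ratio


# A 16-step binary spike train compressed by a ratio of 4.
train = [1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
compressed = compress_spike_train(train, 4)
print(compressed)  # [2, 2, 1, 0]
```

Under these assumptions the compressed train is four times shorter while the total spike count is preserved, which is the property a rate code relies on. A temporal code would additionally need the within-window timing to be encoded in the weights, which this sketch does not attempt.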
