Revisiting Batch Normalization for Training Low-latency Deep Spiking Neural Networks from Scratch

Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to their sparse, asynchronous, and binary event- (or spike-) driven processing, which can yield large energy-efficiency benefits on neuromorphic hardware. However, training high-accuracy, low-latency SNNs from scratch suffers from the non-differentiable nature of the spiking neuron. To address this training issue, we revisit batch normalization and propose a temporal Batch Normalization Through Time (BNTT) technique. Most prior SNN works have disregarded batch normalization, deeming it ineffective for training temporal SNNs. Unlike previous works, our proposed BNTT decouples the parameters of a BNTT layer along the time axis to capture the temporal dynamics of spikes. The temporally evolving learnable parameters in BNTT allow a neuron to control its spike rate across time-steps, enabling low-latency and low-energy training from scratch. We conduct experiments on the CIFAR-10, CIFAR-100, Tiny-ImageNet, and event-driven DVS-CIFAR10 datasets. BNTT allows us to train deep SNN architectures from scratch, for the first time, on complex datasets with just 25-30 time-steps. We also propose an early-exit algorithm that uses the distribution of the BNTT parameters to reduce inference latency, further improving energy efficiency.
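Since decoupling the normalization parameters along the time axis is the core idea, a compact sketch helps make it concrete. Below is a minimal sketch, assuming a PyTorch realization; the class names (`BNTT`, `SpikingConvBlock`), the leaky-integrate-and-fire update, and the `threshold`/`leak` hyperparameters are illustrative assumptions, not the authors' released code. Keeping one full `nn.BatchNorm2d` per time-step (each with its own learnable scale and running statistics) is one simple way to realize the per-time-step decoupling the abstract describes.

```python
# Minimal BNTT sketch (assumed PyTorch realization, not the reference code).
import torch
import torch.nn as nn

class BNTT(nn.Module):
    """Batch Normalization Through Time: an independent BN per time-step."""

    def __init__(self, num_features: int, time_steps: int):
        super().__init__()
        # Decouple BN along the time axis: each time-step gets its own
        # learnable parameters and its own running statistics.
        self.bn = nn.ModuleList(
            [nn.BatchNorm2d(num_features) for _ in range(time_steps)]
        )

    def forward(self, x: torch.Tensor, t: int) -> torch.Tensor:
        # Normalize the pre-activation at time-step t with that step's
        # parameters, letting the network modulate spike rates over time.
        return self.bn[t](x)

class SpikingConvBlock(nn.Module):
    """Conv + BNTT feeding a leaky-integrate-and-fire (LIF) neuron."""

    def __init__(self, in_ch: int, out_ch: int, time_steps: int,
                 threshold: float = 1.0, leak: float = 0.99):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.bntt = BNTT(out_ch, time_steps)
        self.threshold, self.leak = threshold, leak

    def forward(self, spikes_in, mem, t):
        # Leaky integration of the BNTT-normalized input current.
        mem = self.leak * mem + self.bntt(self.conv(spikes_in), t)
        # The hard threshold is non-differentiable; training from scratch
        # would replace its gradient with a surrogate function.
        spikes_out = (mem >= self.threshold).float()
        mem = mem - spikes_out * self.threshold  # soft reset after a spike
        return spikes_out, mem

# Toy usage: unroll one block for T time-steps on rate-coded inputs.
T, B = 25, 8
block = SpikingConvBlock(in_ch=3, out_ch=64, time_steps=T)
mem = torch.zeros(B, 64, 32, 32)
img = torch.rand(B, 3, 32, 32)
for t in range(T):
    spikes_in = (torch.rand_like(img) < img).float()  # Bernoulli rate coding
    spikes_out, mem = block(spikes_in, mem, t)
```

Because each time-step owns its own learnable scale, the magnitude of those scales across time provides a natural signal for the early-exit idea mentioned above, e.g., stopping inference once the later-step scales become negligible (a hedged reading of the abstract, not the authors' exact criterion).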
