Deep Residual Learning in Spiking Neural Networks

Deep Spiking Neural Networks (SNNs) present optimization difficulties for gradient-based approaches due to their discrete, binary activations and complex spatio-temporal dynamics. Given the huge success of ResNet in deep learning, it is natural to train deep SNNs with residual learning. The previous Spiking ResNet mimics the standard residual block of ANNs and simply replaces the ReLU activation layers with spiking neurons; it suffers from the degradation problem and can hardly implement residual learning. In this paper, we propose the Spike-Element-Wise (SEW) ResNet to realize residual learning in deep SNNs. We prove that SEW ResNet can easily implement identity mapping and overcome the vanishing/exploding gradient problems of Spiking ResNet. We evaluate SEW ResNet on the ImageNet, DVS Gesture, and CIFAR10-DVS datasets, and show that it outperforms state-of-the-art directly trained SNNs in both accuracy and time-steps. Moreover, SEW ResNet achieves higher performance by simply adding more layers, providing a simple method to train deep SNNs. To the best of our knowledge, this is the first time that directly training deep SNNs with more than 100 layers has become possible. Our code is available at https://github.com/fangwei123456/Spike-Element-Wise-ResNet.
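Since only the abstract appears here, the following PyTorch sketch illustrates the idea of a spike-element-wise residual block rather than reproducing the authors' implementation (which is in the linked repository). The `SurrogateIFNeuron` (its sigmoid surrogate slope and hard reset), the single-time-step formulation, and the `ADD`/`AND`/`IAND` names for the element-wise function g are assumptions made for this example.

```python
import torch
import torch.nn as nn


class SurrogateIFNeuron(nn.Module):
    """Single-step integrate-and-fire neuron with a sigmoid surrogate gradient.

    A hypothetical stand-in for the paper's spiking neuron layers; the
    surrogate slope (4.0) and hard reset are illustrative choices.
    """

    def __init__(self, v_threshold: float = 1.0):
        super().__init__()
        self.v_threshold = v_threshold
        self.v = 0.0  # membrane potential; call reset() between samples

    def reset(self):
        self.v = 0.0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.v = self.v + x  # integrate the input current
        over = self.v - self.v_threshold
        spike_hard = (over >= 0).float()        # binary spikes (forward pass)
        spike_soft = torch.sigmoid(4.0 * over)  # differentiable surrogate
        # Straight-through estimator: binary values in the forward pass,
        # surrogate gradients in the backward pass.
        spike = spike_hard.detach() + spike_soft - spike_soft.detach()
        self.v = self.v * (1.0 - spike_hard)    # hard reset where a spike fired
        return spike


class SEWBlock(nn.Module):
    """Spike-Element-Wise residual block: out = g(body(s), s).

    Both body(s) and the shortcut s are spike tensors, and g is applied
    element-wise; with g = ADD and a silent residual branch, the block
    reduces to an identity mapping.
    """

    def __init__(self, channels: int, g: str = "ADD"):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            SurrogateIFNeuron(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            SurrogateIFNeuron(),
        )
        self.g = g

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        out = self.body(s)
        if self.g == "ADD":
            return out + s            # may exceed 1 (counts coincident spikes)
        if self.g == "AND":
            return out * s            # logical AND on {0, 1} tensors
        if self.g == "IAND":
            return (1.0 - out) * s    # s AND (NOT out)
        raise ValueError(f"unknown element-wise function: {self.g}")


if __name__ == "__main__":
    block = SEWBlock(16, g="ADD")
    s = (torch.rand(1, 16, 8, 8) > 0.5).float()  # random binary "spikes"
    print(block(s).shape)  # torch.Size([1, 16, 8, 8])
```

Note that with g = ADD the block's output is no longer strictly binary; choosing among ADD, AND, and IAND trades off how easily identity mapping is realized against keeping the activations binary spikes.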
