Progressive Tandem Learning for Pattern Recognition With Deep Spiking Neural Networks

Spiking neural networks (SNNs) have shown clear advantages over traditional artificial neural networks (ANNs) in latency and computational efficiency, owing to their event-driven nature and sparse communication. However, training deep SNNs is not straightforward. In this paper, we propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition, referred to as progressive tandem learning of deep SNNs. By studying the equivalence between ANNs and SNNs in the discrete representation space, we introduce a primitive network conversion method that takes full advantage of the spike count to approximate the activation value of analog neurons. To compensate for the approximation errors arising from this primitive conversion, we further introduce a layer-wise learning method with an adaptive training scheduler to fine-tune the network weights. The progressive tandem learning framework also allows hardware constraints, such as limited weight precision and fan-in connections, to be imposed progressively during training. The SNNs thus trained demonstrate remarkable classification and regression capabilities on large-scale object recognition, image reconstruction, and speech separation tasks, while requiring at least an order of magnitude less inference time and fewer synaptic operations than other state-of-the-art SNN implementations. This framework therefore opens up a myriad of opportunities for pervasive mobile and embedded devices with a limited power budget.
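
To make the spike-count approximation concrete, consider the following minimal Python sketch (our illustration, not the authors' released code). It simulates a non-leaky integrate-and-fire neuron driven by a constant input current for T time steps; the resulting spike count is a T-level quantisation of the clipped ReLU activation, which is the discrete-representation equivalence that the primitive conversion exploits. The function name and the reset-by-subtraction choice are illustrative assumptions.

    def if_spike_count(x, T=16, threshold=1.0):
        """Simulate a non-leaky integrate-and-fire (IF) neuron for T steps.

        A constant input current x is integrated at every step; the neuron
        fires and resets by subtraction whenever the membrane potential
        crosses the threshold. For 0 <= x <= threshold the spike count is
        floor(x * T / threshold), a T-level quantisation of the ReLU output.
        """
        v, spikes = 0.0, 0
        for _ in range(T):
            v += x                 # integrate the input current
            if v >= threshold:     # fire, then reset by subtraction
                v -= threshold
                spikes += 1
        return spikes

    # The spike count tracks the floor-quantised, clipped ReLU activation:
    for x in [-0.3, 0.1, 0.45, 0.9]:
        analog = min(max(x, 0.0), 1.0)   # clipped ReLU in [0, 1]
        print(x, if_spike_count(x), int(analog * 16))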

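The layer-wise fine-tuning stage can be sketched in the same spirit. The PyTorch snippet below is a sketch under stated assumptions, not the paper's implementation: the toy MLP, the straight-through gradient, and the fixed epochs-per-layer schedule (standing in for the adaptive training scheduler) are all ours. It converts one ReLU layer at a time into a spike-count-quantised counterpart and fine-tunes the whole network after each conversion, so the still-analog layers absorb the quantisation error before the next layer is converted.

    import torch
    import torch.nn as nn

    T = 16  # encoding window: number of discrete spike-count levels

    class SpikeCountReLU(nn.Module):
        """ReLU whose output is quantised to T spike-count levels.
        The forward pass mimics the IF neuron's spike count; the backward
        pass uses a straight-through estimator so gradients keep flowing."""
        def forward(self, x):
            q = torch.clamp(torch.floor(x * T), 0, T) / T
            return x + (q - x).detach()   # forward: q, backward: identity

    def build_model():
        return nn.Sequential(
            nn.Linear(784, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 10),
        )

    def progressive_tandem_finetune(model, loader, epochs_per_layer=2):
        relu_idx = [i for i, m in enumerate(model) if isinstance(m, nn.ReLU)]
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        loss_fn = nn.CrossEntropyLoss()
        for i in relu_idx:                     # convert one layer at a time
            model[i] = SpikeCountReLU()
            for _ in range(epochs_per_layer):  # fixed stand-in for the
                for x, y in loader:            # adaptive scheduler
                    opt.zero_grad()
                    loss_fn(model(x.flatten(1)), y).backward()
                    opt.step()
        return model

Hardware constraints such as limited weight precision could be imposed inside the same loop, for example by quantising the weights of already-converted layers after each optimiser step.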