Low-power neuromorphic speech recognition engine with coarse-grain sparsity

Recent years have seen a surge of interest in neuromorphic computing and its hardware design for cognitive applications. In this work, we present new neuromorphic architecture, circuit, and device co-designs that enable spike-based classification for speech recognition tasks. The proposed neuromorphic speech recognition engine supports a sparsely connected deep spiking network with coarse granularity, leading to a large memory reduction with minimal index information. Simulation results show that the proposed deep spiking neural network accelerator achieves a phoneme error rate (PER) of 20.5% on the TIMIT database and consumes 2.57 mW in 40 nm CMOS at real-time performance. To alleviate the memory bottleneck, the use of non-volatile memory is also evaluated and discussed.
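
To make the "coarse granularity" and "minimal index information" claims concrete, the following is a minimal sketch of block-wise (coarse-grain) sparsity, not the accelerator's actual implementation. The layer size (1024 x 1024), block size (64), keep ratio (0.25), random block selection, and the one-bit-per-weight bitmap used as the fine-grain baseline are all illustrative assumptions that do not come from the abstract; the function names are hypothetical.

```python
import random


def make_block_mask(rows, cols, block, keep_ratio, seed=0):
    """Choose which (block x block) tiles of a rows x cols weight matrix are kept.

    Assumption: tiles are picked uniformly at random; the source does not
    say how connections are selected.
    """
    assert rows % block == 0 and cols % block == 0
    rng = random.Random(seed)
    n_block_rows, n_block_cols = rows // block, cols // block
    n_blocks = n_block_rows * n_block_cols
    kept = set(rng.sample(range(n_blocks), round(n_blocks * keep_ratio)))
    return [
        [(r * n_block_cols + c) in kept for c in range(n_block_cols)]
        for r in range(n_block_rows)
    ]


def storage_summary(rows, cols, block, mask):
    """Compare weight and index storage for the block mask."""
    kept_blocks = sum(sum(row) for row in mask)
    return {
        "dense_weights": rows * cols,
        "stored_weights": kept_blocks * block * block,
        # Coarse-grain index: one bit per block.
        "coarse_index_bits": len(mask) * len(mask[0]),
        # Assumed fine-grain baseline: one bit per individual weight.
        "fine_index_bits": rows * cols,
    }


def block_sparse_matvec(weights, mask, block, x):
    """Compute y = W @ x, skipping every tile the mask marks as dropped."""
    y = [0.0] * len(weights)
    for br, mask_row in enumerate(mask):
        for bc, keep in enumerate(mask_row):
            if not keep:
                continue  # a dropped tile costs no memory reads and no MACs
            for i in range(br * block, (br + 1) * block):
                row = weights[i]
                y[i] += sum(row[j] * x[j] for j in range(bc * block, (bc + 1) * block))
    return y


mask = make_block_mask(1024, 1024, 64, keep_ratio=0.25)
summary = storage_summary(1024, 1024, 64, mask)
```

Under these assumed numbers, the mask keeps 64 of 256 tiles, so the layer stores a quarter of its weights while the index shrinks from one bit per weight to one bit per tile. That gap is the intuition behind pairing a large memory reduction with very little index overhead.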
