Bio-inspired Multi-layer Spiking Neural Network Extracts Discriminative Features from Speech Signals

Spiking neural networks (SNNs) enable power-efficient implementations due to their sparse, spike-based coding scheme. This paper develops a bio-inspired SNN that uses unsupervised learning to extract discriminative features from speech signals, which can subsequently be used in a classifier. The architecture consists of a spiking convolutional/pooling layer followed by a fully connected spiking layer for feature discovery. The convolutional layer of leaky integrate-and-fire (LIF) neurons represents primary acoustic features. The fully connected layer is equipped with a probabilistic spike-timing-dependent plasticity (STDP) learning rule and represents the discriminative features through probabilistic LIF neurons. To assess the discriminative power of the learned features, they are used in a hidden Markov model (HMM) for spoken digit recognition. The experimental results show recognition accuracy above 96%, which compares favorably with popular statistical feature-extraction methods. Our results provide a novel demonstration of unsupervised feature acquisition in an SNN.
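
The abstract describes two key ingredients: a layer of LIF neurons and a fully connected layer trained with a probabilistic STDP rule. The sketch below is a rough illustration of those ingredients only, not the authors' exact formulation (whose equations are not reproduced here); all parameter names and values (tau_m, v_thresh, a_plus, a_minus, p_update) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch, under assumed parameters: an LIF layer driven by binary input
# spikes, plus a simplified stochastic STDP update in which potentiation and
# depression are applied probabilistically. Not the paper's exact model.

rng = np.random.default_rng(0)

class LIFLayer:
    def __init__(self, n_in, n_out, tau_m=20.0, v_thresh=1.0, dt=1.0):
        self.w = rng.uniform(0.0, 0.3, size=(n_out, n_in))  # synaptic weights
        self.v = np.zeros(n_out)                             # membrane potentials
        self.tau_m, self.v_thresh, self.dt = tau_m, v_thresh, dt

    def step(self, in_spikes):
        # Leaky integration of weighted presynaptic spikes.
        self.v += self.dt * (-self.v / self.tau_m + self.w @ in_spikes)
        out_spikes = (self.v >= self.v_thresh).astype(float)
        self.v[out_spikes > 0] = 0.0                          # reset after a spike
        return out_spikes

def probabilistic_stdp(w, pre_spikes, post_spikes,
                       a_plus=0.01, a_minus=0.005, p_update=0.5):
    """Simplified stochastic STDP: with probability p_update, synapses whose
    presynaptic neuron fired together with a spiking postsynaptic neuron are
    potentiated; the other synapses onto that neuron are depressed.
    Weights are kept in [0, 1]."""
    for j in np.flatnonzero(post_spikes):
        if rng.random() < p_update:
            w[j] += np.where(pre_spikes > 0,
                             a_plus * (1.0 - w[j]),
                             -a_minus * w[j])
    return np.clip(w, 0.0, 1.0)

# Toy usage with random input spike trains.
layer = LIFLayer(n_in=40, n_out=10)
for t in range(100):
    pre = (rng.random(40) < 0.1).astype(float)
    post = layer.step(pre)
    layer.w = probabilistic_stdp(layer.w, pre, post)
```

In the pipeline the abstract outlines, the spike trains produced after convolution and pooling would serve as the feature sequence passed to the HMM; this toy loop only shows the neuron dynamics and the stochastic weight update in isolation.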
