A spiking network that learns to extract spike signatures from speech signals

Spiking neural networks (SNNs) with adaptive synapses reflect core properties of biological neural networks. Speech recognition, an application that involves audio coding and dynamic learning, provides a good test problem for studying SNN functionality. We present a simple, novel, and efficient nonrecurrent SNN that learns to convert a speech signal into a spike train signature. The signature is distinguishable from the signatures of speech signals representing other words, which enables digit recognition and discrimination in devices that use only spiking neurons. The network is small and consists of Izhikevich neurons equipped with spike-timing-dependent plasticity (STDP) and biologically realistic synapses. The approach yields a fast, efficient network that needs no error-feedback training, although it does require supervised training. In simulation, the network produces discriminative spike train patterns for spoken digits: highly correlated spike trains belong to the same category, and weakly correlated spike trains belong to different categories. The proposed SNN is evaluated on a spoken digit recognition task using a subset of the Aurora speech dataset. The experimental results show that the network performs well in terms of both accuracy and complexity.
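
As a rough illustration of two ingredients named in the abstract, the sketch below simulates a single Izhikevich neuron and compares spike trains by correlation. It is not the authors' network: STDP, the synapse model, and the speech front end are omitted, and the parameter values (the commonly quoted regular-spiking settings), the step-current inputs, the 0.5 ms Euler step, and the binned Pearson correlation are all assumptions chosen for illustration.

```python
import numpy as np

def izhikevich_spikes(current, a=0.02, b=0.2, c=-65.0, d=8.0, dt=0.5):
    """Simulate one Izhikevich neuron with forward Euler steps of dt ms.

    Returns a 0/1 array with one entry per time step. The defaults are the
    commonly quoted regular-spiking parameters (an assumption, not taken
    from the abstract).
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spikes = np.zeros(len(current), dtype=int)
    for t, i_t in enumerate(current):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_t)
        u += dt * a * (b * v - u)
        if v >= 30.0:      # spike threshold reached: record and reset
            spikes[t] = 1
            v = c
            u += d
    return spikes

def spike_train_correlation(s1, s2, bin_size=200):
    """Pearson correlation of binned spike counts (illustrative measure only)."""
    s1, s2 = np.asarray(s1), np.asarray(s2)
    n = (min(len(s1), len(s2)) // bin_size) * bin_size
    c1 = s1[:n].reshape(-1, bin_size).sum(axis=1)
    c2 = s2[:n].reshape(-1, bin_size).sum(axis=1)
    if c1.std() == 0 or c2.std() == 0:
        return 0.0         # correlation is undefined for constant counts
    return float(np.corrcoef(c1, c2)[0, 1])

# Hypothetical stand-ins for two "words": current on early vs. current on late.
rng = np.random.default_rng(0)
word_a = np.concatenate([np.full(1000, 10.0), np.zeros(1000)])
word_b = np.concatenate([np.zeros(1000), np.full(1000, 10.0)])
word_a_noisy = word_a + rng.normal(0.0, 1.0, size=word_a.shape)

spikes_a = izhikevich_spikes(word_a)
spikes_a_noisy = izhikevich_spikes(word_a_noisy)
spikes_b = izhikevich_spikes(word_b)

same = spike_train_correlation(spikes_a, spikes_a_noisy)
different = spike_train_correlation(spikes_a, spikes_b)
print(f"same category: {same:.2f}, different category: {different:.2f}")
```

In this toy setting, a noisy repeat of the same input should correlate more strongly with the original spike train than a different input does, which mirrors the same-category versus different-category criterion described above.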
