Robust Sequence Recognition Using Biologically Inspired Temporal Learning Mechanisms

Biological neural networks communicate through the spike sequences fired by their neurons. In this work, we propose a biologically plausible spiking neural network for sequence recognition. The system is consistent with auditory coding, learning, and decoding functions, and it can help reveal the neural mechanisms linking upstream and downstream neurons. In this model, external stimuli are encoded into spatiotemporal spike patterns, and features of the encoded patterns are further extracted with the K-Singular Value Decomposition (K-SVD) algorithm. Temporal learning rules are then used to learn the spike patterns, and downstream neurons finally decode the sequence order. In our experiments, two speech datasets (TIDIGIT and FSDD) are used to evaluate the proposed speech recognition system and to compare it with other models. Experimental results show that our model maintains good performance under noisy conditions and achieves higher sequence-identification accuracy than other typical algorithms.
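As an illustration of the encoding and order-decoding idea only, the following minimal sketch maps feature values to spike times and reads back the firing order. The abstract does not specify the encoding scheme, so the linear latency code used here, along with the names `latency_encode`, `firing_order`, `t_max`, and `threshold`, is an assumption made for illustration and not the method of the proposed model.

```python
from typing import List, Optional


def latency_encode(features: List[float], t_max: float = 100.0,
                   threshold: float = 0.0) -> List[Optional[float]]:
    """Map each feature value to one spike time: larger values fire earlier.

    Assumed scheme (not from the paper): a linear latency code.
    Values at or below `threshold` produce no spike (None).
    """
    peak = max(features, default=0.0)
    if peak <= threshold:
        return [None] * len(features)
    spikes: List[Optional[float]] = []
    for value in features:
        if value <= threshold:
            spikes.append(None)
        else:
            # The strongest input fires at t = 0; weaker inputs fire later.
            spikes.append(t_max * (1.0 - value / peak))
    return spikes


def firing_order(spike_times: List[Optional[float]]) -> List[int]:
    """Return neuron indices sorted by spike time, skipping silent neurons."""
    active = [(t, i) for i, t in enumerate(spike_times) if t is not None]
    return [i for _, i in sorted(active)]


# Hypothetical feature vector for four input neurons.
spikes = latency_encode([0.2, 1.0, 0.0, 0.5])
order = firing_order(spikes)
print(spikes, order)
```

In this toy setting the spatiotemporal pattern is one spike time per neuron, and the sequence order is recovered by sorting those times. A learned spiking decoder, as described in the abstract, would replace the explicit sort.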
