Brain Signals to Rescue Aphasia, Apraxia and Dysarthria Speech Recognition

In this paper, we propose a deep learning-based algorithm that improves the performance of automatic speech recognition (ASR) systems on aphasia, apraxia, and dysarthria speech by utilizing electroencephalography (EEG) features recorded synchronously with that speech. We demonstrate a significant decoding performance improvement of more than 50% at test time on the isolated speech recognition task, and we provide preliminary results indicating that EEG features also improve performance on the more challenging continuous speech recognition task. These results are a first step towards demonstrating that non-invasive neural signals can be used to design a real-time, robust speech prosthetic for stroke survivors recovering from aphasia, apraxia, and dysarthria. Our aphasia, apraxia, and dysarthria speech-EEG data set will be released to the public to help further advance this interesting and crucial research.
