Automatic Paraphasia Detection from Aphasic Speech: A Preliminary Study

Aphasia is an acquired language disorder, caused by brain damage, that can lead to significant communication difficulties. Aphasic speech is often characterized by errors known as paraphasias, the analysis of which can help determine an appropriate course of treatment and track an individual's recovery progress. Automatic paraphasia detection therefore has many potential clinical benefits; however, the problem has not previously been investigated in the literature. In this paper, we present the first study on detecting phonemic and neologistic paraphasias from scripted speech samples in AphasiaBank. We propose a speech recognition system with task-specific language models to transcribe aphasic speech automatically. We investigate features based on speech duration, Goodness of Pronunciation, phone edit distance, and Dynamic Time Warping on phoneme posteriorgrams. Our results demonstrate the feasibility of automatic paraphasia detection and outline a path toward deploying such a system in real-world clinical applications.
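
To make the feature definitions concrete, the sketch below (in Python with NumPy, not taken from the paper) illustrates two of the features named above as they are commonly computed in mispronunciation-detection work: a length-normalized phone edit distance between a canonical and a decoded phone sequence, and a DTW alignment cost between two phoneme posteriorgrams. The function names, the symmetrized-KL local distance, and the normalization choices are assumptions for illustration, not the authors' exact formulation.

```python
# Hedged sketch of two paraphasia-detection features named in the abstract.
# Assumptions (not from the paper): symmetrized KL as the DTW local distance,
# and edit distance normalized by the canonical sequence length.

import numpy as np


def phone_edit_distance(canonical, decoded):
    """Length-normalized Levenshtein distance between two phone sequences."""
    m, n = len(canonical), len(decoded)
    d = np.zeros((m + 1, n + 1), dtype=int)
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if canonical[i - 1] == decoded[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,         # deletion
                          d[i, j - 1] + 1,         # insertion
                          d[i - 1, j - 1] + cost)  # substitution
    return d[m, n] / max(m, 1)


def dtw_posteriorgram_cost(P, Q, eps=1e-8):
    """DTW alignment cost between posteriorgrams P (T1 x K) and Q (T2 x K).

    Each row is a per-frame posterior distribution over K phone classes;
    a symmetrized KL divergence is used as the local distance (an assumption).
    """
    T1, T2 = P.shape[0], Q.shape[0]
    P, Q = P + eps, Q + eps
    local = np.zeros((T1, T2))
    for i in range(T1):
        for j in range(T2):
            local[i, j] = 0.5 * (np.sum(P[i] * np.log(P[i] / Q[j])) +
                                 np.sum(Q[j] * np.log(Q[j] / P[i])))
    # Standard DTW recursion with insertion / deletion / match moves.
    acc = np.full((T1, T2), np.inf)
    acc[0, 0] = local[0, 0]
    for i in range(T1):
        for j in range(T2):
            if i == 0 and j == 0:
                continue
            best_prev = min(acc[i - 1, j] if i > 0 else np.inf,
                            acc[i, j - 1] if j > 0 else np.inf,
                            acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
            acc[i, j] = local[i, j] + best_prev
    return acc[-1, -1] / (T1 + T2)  # normalized by a rough path length


if __name__ == "__main__":
    # Toy example: canonical vs. decoded phones for a phonemic paraphasia
    # ("tephalone" for "telephone"), plus two small random posteriorgrams.
    print(phone_edit_distance(["T", "EH", "L", "AH", "F", "OW", "N"],
                              ["T", "EH", "F", "AH", "L", "OW", "N"]))
    rng = np.random.default_rng(0)
    P = rng.dirichlet(np.ones(40), size=12)  # 12 frames, 40 phone classes
    Q = rng.dirichlet(np.ones(40), size=15)
    print(dtw_posteriorgram_cost(P, Q))
```

In an actual pipeline, the decoded phone sequence and the posteriorgrams would come from the ASR front end rather than the toy inputs used here.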
