Automatic quantitative analysis of spontaneous aphasic speech

Abstract Spontaneous speech analysis plays an important role in the study and treatment of aphasia, but it can be difficult to perform manually because speech transcription and coding are time-consuming. Techniques in automatic speech recognition and assessment can potentially alleviate this problem by allowing clinicians to quickly process large amounts of speech data. However, automatic analysis of spontaneous aphasic speech has been relatively under-explored in the engineering literature, partly due to the limited amount of available data and the difficulties associated with aphasic speech processing. In this work, we perform one of the first large-scale quantitative analyses of spontaneous aphasic speech based on automatic speech recognition (ASR) output. We describe our acoustic modeling method, which sets a new recognition benchmark on AphasiaBank, a large-scale aphasic speech corpus. We propose a set of clinically relevant quantitative measures that are shown to be highly robust to automatic transcription errors. Finally, we demonstrate that these measures can be used to accurately predict the revised Western Aphasia Battery (WAB-R) Aphasia Quotient (AQ) without the need for manual transcripts. The results and techniques presented in our work will help advance the state of the art in aphasic speech processing and make ASR-based technology for aphasia treatment more feasible in real-world clinical applications.
