Modeling pronunciation, rhythm, and intonation for automatic assessment of speech quality in aphasia rehabilitation

Patients with aphasia often have impaired speech-language production skills, resulting in severe difficulty with tasks that require verbal communication. To facilitate rehabilitation outside of therapy, we are collaborating with the University of Michigan Aphasia Program (UMAP) to develop an automated system capable of providing feedback on the patient’s verbal output. In this paper, we introduce a robust method, based on template matching, for extracting rhythm and intonation features from aphasic speech. These features, combined with Goodness of Pronunciation (GOP) scores and our previous feature set, help our system achieve human-level performance in classifying the quality of speech produced by patients attending UMAP. The results presented in this work demonstrate the efficacy of our technique and the system's potential to handle both natural speech recorded in non-ideal conditions and the unpredictability of aphasic speech patterns.
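The abstract does not say which matching algorithm underlies the template-based features. As an illustration only, the sketch below assumes dynamic time warping (DTW), a common choice for comparing prosodic contours of different lengths. The function name `dtw_distance`, the absolute-difference local cost, and the length normalization are all hypothetical choices, not the authors' method. Contours are assumed to be plain lists of per-frame values (for example, normalized pitch or energy).

```python
import math


def dtw_distance(template, contour):
    """Length-normalized DTW cost between two 1-D feature sequences.

    Hypothetical helper: the local cost (absolute difference) and the
    normalization by len(template) + len(contour) are assumptions.
    """
    n, m = len(template), len(contour)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning the first i
    # template frames with the first j contour frames.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(template[i - 1] - contour[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # template frame repeated
                cost[i][j - 1],      # contour frame repeated
                cost[i - 1][j - 1],  # both sequences advance
            )

    return cost[n][m] / (n + m)
```

Under these assumptions, a contour that is merely a slower rendition of the template (each frame repeated) scores zero, while a contour with a different shape scores higher. A distance like this could serve as one rhythm or intonation feature per utterance.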
