Intelligibility Assessment and Speech Recognizer Word Accuracy Rate Prediction for Dysarthric Speakers in a Factor Analysis Subspace

Automated intelligibility assessments can help speech and language therapists determine the type of dysarthria their clients present with. They can also help predict how well a person with dysarthria might cope with a voice interface to assistive technology. Our approach to intelligibility assessment is based on iVectors, a set of measures that capture many aspects of a person’s speech, including intelligibility. The major advantage of iVectors is that they compress the acoustic information contained in an utterance into a small number of measures, which makes them well suited to simple predictors. We show that intelligibility assessment works best when a set of words from the speaker under evaluation has already been annotated for intelligibility and can be used to train our system. We discuss the implications of our findings for practice.
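As an illustration of what a "simple predictor" over utterance-level vectors might look like, the following is a minimal sketch, not the system described here. It assumes that each utterance has already been reduced to a fixed-length vector and that a numeric intelligibility score is available for each training utterance. The vectors and scores below are synthetic stand-ins, the dimensionality of 20 is arbitrary, and ridge regression is used only as an example of a simple predictor; the helper names `fit_ridge` and `predict` are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on centred data; returns (weights, bias)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector.
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def predict(X, w, b):
    """Predict one score per row of X."""
    return X @ w + b


# Synthetic stand-ins (assumption): random vectors play the role of
# utterance-level vectors, and the scores are a noisy linear function of them.
rng = np.random.default_rng(0)
dim = 20
true_w = rng.normal(size=dim)

X_train = rng.normal(size=(200, dim))
y_train = X_train @ true_w + 0.1 * rng.normal(size=200)
X_test = rng.normal(size=(50, dim))
y_test = X_test @ true_w + 0.1 * rng.normal(size=50)

w, b = fit_ridge(X_train, y_train)
y_pred = predict(X_test, w, b)

# Pearson correlation between predicted and reference scores.
pearson_r = np.corrcoef(y_test, y_pred)[0, 1]
print(f"Pearson r on held-out synthetic data: {pearson_r:.3f}")
```

In a speaker-dependent setting of the kind the abstract describes, the training rows would come from previously annotated words of the speaker being evaluated.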
