Towards a clinical tool for automatic intelligibility assessment

An important yet under-explored problem in speech processing is the automatic assessment of intelligibility for pathological speech. In practice, intelligibility is often assessed through subjective tests administered by speech pathologists; however, research has shown that these tests are inconsistent and costly, and that they exhibit poor reliability. Although some automatic methods for intelligibility assessment exist for telecommunications, research specific to pathological speech has been limited. Here, we propose an algorithm that captures multi-scale perceptual cues shown to correlate well with intelligibility. A nonlinear classifier is trained at each time scale, and a final intelligibility decision is made using ensemble learning methods. Preliminary results indicate a marked improvement in intelligibility assessment over published baseline results.
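
The per-scale classification and ensemble decision described above can be illustrated with a minimal sketch. The abstract does not specify the classifiers, the features, or the ensemble rule, so everything here is an assumption made for illustration: a k-nearest-neighbour classifier stands in for the nonlinear classifier, a simple majority vote stands in for the ensemble method, and the scale names (`short`, `medium`, `long`), feature values, and labels are hypothetical toy data.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, x, k=3):
    """Stand-in nonlinear classifier (assumption): k-nearest-neighbour label vote."""
    # Sort training points by Euclidean distance to x and keep the k closest.
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(train_by_scale, labels, sample_by_scale, k=3):
    """Classify at each time scale, then combine the votes by simple majority (assumption)."""
    votes = [
        knn_predict(train_by_scale[scale], labels, sample_by_scale[scale], k)
        for scale in train_by_scale
    ]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy features: one 2-D feature vector per speaker at each time scale.
train_by_scale = {
    "short": [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)],
    "medium": [(0.0, 0.1), (0.1, 0.0), (1.0, 0.9), (0.9, 1.0)],
    "long": [(0.2, 0.2), (0.3, 0.1), (0.7, 0.9), (0.8, 0.8)],
}
labels = ["low", "low", "high", "high"]  # hypothetical intelligibility classes

# A new speaker whose short- and medium-scale cues resemble the "high" group.
sample = {"short": (0.85, 0.85), "medium": (0.95, 0.9), "long": (0.25, 0.2)}
decision = ensemble_predict(train_by_scale, labels, sample)
```

In this toy case the long-scale classifier disagrees with the other two, and the majority vote settles the final decision. A learned combiner could replace the vote without changing the overall structure.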
