Automatic intelligibility assessment of speakers after laryngeal cancer by means of acoustic modeling.

OBJECTIVE One aspect of voice and speech evaluation after laryngeal cancer is acoustic analysis. Perceptual evaluation by expert raters is the clinical standard for global criteria such as overall quality or intelligibility. To date, automatic approaches have evaluated the acoustic properties of pathologic voices using the voiced/unvoiced distinction and fundamental frequency analysis of sustained vowels. Because highly pathologic voices contain a large proportion of noise and become increasingly aperiodic, a fully automatic analysis of fundamental frequency is difficult. We introduce a purely data-driven system for the acoustic analysis of pathologic voices based on recordings of a standard text.

METHODS Short-time segments of the speech signal are analyzed in the spectral domain, and speaker models are built from this information. These speaker models act as a clustered representation of the acoustic properties of a person's voice and are thus characteristic of speakers with different kinds and degrees of pathologic conditions. The system is evaluated on two data sets of speakers reading standardized texts. One data set contains 77 speakers treated for laryngeal cancer by partial removal of the larynx. The other contains 54 totally laryngectomized patients equipped with a Provox shunt valve. Each speaker was rated by five expert listeners on three criteria: strain, voice quality, and speech intelligibility.

RESULTS/CONCLUSION For each data set, the correlation between the automatic system and the mean rating of the five raters reaches both r and ρ ≥ 0.8. The interrater correlation, computed between one rater and the mean of the remaining raters, is in the same range. We thus assume that, for selected evaluation criteria, the system can serve as a validated objective support for acoustic voice and speech analysis.
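The agreement measures described in the abstract can be illustrated with a minimal sketch. It assumes that r denotes Pearson's correlation coefficient and ρ Spearman's rank correlation, which the abstract does not state explicitly. The function names and all numbers below are hypothetical toy values chosen for illustration; they are not data or results from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_rating(ratings, exclude=None):
    """Per-speaker mean over raters, optionally leaving one rater out.

    `ratings` holds one list per rater, each with one score per speaker.
    """
    kept = [r for i, r in enumerate(ratings) if i != exclude]
    return [sum(column) / len(column) for column in zip(*kept)]


def leave_one_out_interrater(ratings):
    """Correlate each rater with the mean of the remaining raters."""
    return [
        pearson(rater, mean_rating(ratings, exclude=i))
        for i, rater in enumerate(ratings)
    ]


# Hypothetical toy data: 5 raters x 6 speakers on a 1-5 scale (invented).
toy_ratings = [
    [1, 2, 3, 4, 5, 3],
    [1, 2, 2, 4, 5, 4],
    [2, 2, 3, 5, 5, 3],
    [1, 3, 3, 4, 4, 3],
    [1, 2, 3, 4, 5, 4],
]
# Hypothetical scores from an automatic system for the same 6 speakers.
toy_automatic = [1.2, 2.1, 2.9, 4.2, 4.8, 3.3]

reference = mean_rating(toy_ratings)
print("system vs. mean rater, r:  ", pearson(toy_automatic, reference))
print("system vs. mean rater, rho:", spearman(toy_automatic, reference))
print("leave-one-out interrater r:", leave_one_out_interrater(toy_ratings))
```

Comparing the system against the mean of all raters, and each rater against the mean of the others, puts both figures on a similar footing: in either case a single score series is judged against an averaged human reference.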
