Combining Phonological and Acoustic ASR-Free Features for Pathological Speech Intelligibility Assessment

Intelligibility is widely used to measure the severity of articulatory problems in pathological speech. Recently, a number of automatic intelligibility assessment tools have been developed. Most of them use automatic speech recognition (ASR) to compare the patient's utterance with the target text. These methods are bound to a single language and tend to be less accurate when speakers hesitate or make reading errors. To circumvent these problems, two different ASR-free methods have been developed over the last few years, which rely only on the acoustic or the phonological properties of the utterance. In this paper, we demonstrate that these ASR-free techniques can also predict intelligibility in other languages. Moreover, they prove to be complementary: combining both methods yields even better intelligibility predictions.
