Articulation Entropy: An Unsupervised Measure of Articulatory Precision

Articulatory precision is a critical factor in speaker intelligibility. In this letter, we propose a new measure, which we call “articulation entropy,” that serves as a proxy for the number of distinct phonemes a person produces when speaking. The method is based on the observation that a speaker's ability to achieve an articulatory target, and hence to clearly produce distinct phonemes, is related to the variation of the distribution of speech features that capture articulation: the larger the variation, the larger the number of distinct phonemes produced. In contrast to previous work, the proposed method is completely unsupervised, does not require phonetic segmentation or formant estimation, and can be estimated directly from continuous speech. We evaluate the measure in several experiments on two data sets: a database of English speakers with various neurological disorders and a database of Mandarin speakers with Parkinson's disease. The results show that the measure correlates with subjective evaluations of articulatory precision and reveals differences between healthy individuals and individuals with neurological impairment.
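The idea that a wider spread of articulation features corresponds to a higher entropy can be illustrated with a small sketch. The abstract does not specify which entropy estimator the letter uses, so the code below substitutes a generic Kozachenko-Leonenko k-nearest-neighbor estimator of differential entropy as a stand-in. The function names (`digamma_int`, `knn_entropy`, `synthetic_features`), the choice of k, and the synthetic 2-D "features" are all assumptions made for illustration; real use would replace them with frame-level speech features.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{i=1}^{n-1} 1/i."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def knn_entropy(features: np.ndarray, k: int = 3) -> float:
    """Kozachenko-Leonenko k-NN estimate of differential entropy (in nats).

    Illustrative stand-in only; not necessarily the estimator used in the letter.
    `features` has shape (n_samples, n_dims) and needs n_samples > k.
    """
    x = np.asarray(features, dtype=float)
    n, d = x.shape
    # Brute-force pairwise Euclidean distances (fine for a few thousand points).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbors
    # Distance from each point to its k-th nearest neighbor.
    r_k = np.partition(dist, k - 1, axis=1)[:, k - 1]
    r_k = np.maximum(r_k, 1e-12)  # guard against log(0) for duplicate points
    # Log-volume of the d-dimensional unit ball.
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    return (digamma_int(n) - digamma_int(k) + log_unit_ball
            + d * float(np.mean(np.log(r_k))))


def synthetic_features(rng, spread: float, n: int = 600) -> np.ndarray:
    """Hypothetical data: five 'phoneme' clusters on a circle of radius `spread`."""
    angles = 2 * np.pi * np.arange(5) / 5
    centers = spread * np.column_stack([np.cos(angles), np.sin(angles)])
    labels = rng.integers(0, 5, size=n)
    return centers[labels] + 0.3 * rng.standard_normal((n, 2))


rng = np.random.default_rng(0)
h_precise = knn_entropy(synthetic_features(rng, spread=3.0))  # distinct targets
h_reduced = knn_entropy(synthetic_features(rng, spread=0.5))  # collapsed targets
print(f"precise: {h_precise:.2f} nats, reduced: {h_reduced:.2f} nats")
```

In this toy setup the well-separated clusters yield the larger estimate, which mirrors the abstract's claim that greater variation in the feature distribution corresponds to more distinct phonemes. No labels or segment boundaries enter the computation, which is what makes this kind of measure unsupervised.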
