The spectral dynamics of vowels in Mandarin Chinese

This study investigated the dynamic spectral patterns of vowels in Mandarin Chinese using a corpus of monosyllabic words spoken in isolation. Mel-frequency cepstral coefficients (MFCCs) were parameterized in different ways to test the nature of the dynamic information in vowels through automatic vowel classification. Compared to using the MFCCs extracted at the vowel midpoint alone, using the MFCCs extracted at two or three points (vowel onset, offset, and midpoint) greatly improved classification accuracy. Legendre polynomials fitted to the MFCCs over the entire vowel duration achieved a relative error reduction of approximately 30% over the three-point model. Euclidean cepstral distance was employed to measure the magnitude of spectral change. A negative correlation was found between the rate of spectral change and vowel duration. Vowel-dependent spectral changes appear primarily in the first half of a vowel. There is great diversity among the diphthongs, and considerable overlap between the diphthongs and the monophthongs, in terms of spectral dynamics.
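The parameterizations named in the abstract can be illustrated with a minimal sketch. It assumes each vowel token is already available as a NumPy array of MFCC frames with shape (frames, coefficients). The function names, the polynomial degree, and the choice of the first, middle, and last frames as onset, midpoint, and offset are illustrative assumptions; the abstract does not state the settings actually used in the study.

```python
import numpy as np
from numpy.polynomial import legendre


def three_point_features(mfcc):
    """Concatenate the MFCC vectors at the first, middle, and last frames.

    Assumption: these frames stand in for vowel onset, midpoint, and offset.
    """
    n_frames = mfcc.shape[0]
    return np.concatenate([mfcc[0], mfcc[n_frames // 2], mfcc[-1]])


def legendre_features(mfcc, degree=3):
    """Least-squares fit of Legendre polynomials to each MFCC trajectory.

    Time is normalized to [-1, 1] so tokens of different durations are
    described by the same number of coefficients. The degree is a
    hypothetical default, not a value taken from the study.
    """
    n_frames = mfcc.shape[0]
    t = np.linspace(-1.0, 1.0, n_frames)
    coefs = legendre.legfit(t, mfcc, degree)  # shape: (degree + 1, n_coeffs)
    return coefs.T.ravel()  # grouped by MFCC dimension


def cepstral_distance(mfcc, i, j):
    """Euclidean distance between the MFCC vectors of frames i and j."""
    return float(np.linalg.norm(mfcc[j] - mfcc[i]))


def rate_of_change(mfcc, duration_s):
    """Onset-to-offset cepstral distance divided by vowel duration in seconds."""
    return cepstral_distance(mfcc, 0, mfcc.shape[0] - 1) / duration_s
```

Either feature vector could then be passed to a classifier, and rate_of_change could be compared against duration across tokens; both steps are left out here because the abstract does not specify how they were carried out.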
