F0 perturbations by consonants and their implications on tone recognition

In this paper, we present a study of fundamental frequency (F0) variation in the production of Mandarin Chinese speech and discuss the implications of our results for automatic tone recognition. Target syllables with different consonants and tones were analyzed in continuous speech samples. Compared with the control syllable /ma/, which shows a smooth transition from the onset F0 value to the current tonal target, syllables with voiceless initial consonants show a local raising of F0 at voice onset and shorter F0 trajectories. This effect, known as consonant perturbation of F0, is local and does not alter the underlying course of known contextual tonal variations. Rather, it is superimposed on other effects and thus contributes to the shape of surface F0 contours. Understanding and accounting for consonant perturbation of F0 contours should therefore help improve the performance of current tone recognition systems.
