Classification of Taiwanese tones based on pitch and energy movements

This paper addresses the difficulty of automatically distinguishing the seven Taiwanese tones. A tone recogniser is an essential component of any automatic speech recognition system customised for tone languages such as Taiwanese. We show that the Taiwanese tones are difficult to distinguish using fundamental frequency contours alone, and that the task becomes easier when energy contour features are used alongside the fundamental frequency features. To accommodate energy in the classification model, we present an approach for extracting energy-contour features. The approach is inspired by the ADSR (attack, decay, sustain, release) model used in musical instrument synthesis, in which the envelopes of complex sounds are modelled with only a few parameters. Our experiments demonstrate that including energy in the recognition model allows the seven Taiwanese tones to be discriminated successfully. The paper also presents acoustic measurements of the fundamental frequency and energy features described.
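
As a rough illustration of how an energy envelope can be reduced to a few ADSR-style parameters, the sketch below computes short-time energy and then summarises the contour with three numbers. It is not the authors' feature set, which the abstract does not specify: the function names, the 0.5 sustain threshold, and the three chosen parameters (attack time, sustain level, release time) are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def frame_energy(samples: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Short-time energy: mean of squared samples in each frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def adsr_like_features(envelope: Sequence[float],
                       sustain_fraction: float = 0.5) -> Dict[str, float]:
    """Summarise an energy envelope with a few ADSR-inspired parameters.

    Hypothetical parameterisation (an assumption, not the paper's):
      attack_time   - normalised time taken to reach the energy peak
      sustain_level - mean energy from the peak to the release point,
                      relative to the peak
      release_time  - normalised time from the release point to the end
    The release point is the last frame whose energy is still at least
    sustain_fraction * peak.
    """
    n = len(envelope)
    if n < 2:
        raise ValueError("envelope needs at least two frames")
    peak = max(envelope)
    if peak <= 0:
        raise ValueError("envelope must contain positive energy")

    peak_idx = list(envelope).index(peak)
    threshold = sustain_fraction * peak
    # Last frame at or above the threshold; the peak frame always qualifies.
    release_idx = max(i for i, e in enumerate(envelope) if e >= threshold)

    held = envelope[peak_idx:release_idx + 1]
    return {
        "attack_time": peak_idx / (n - 1),
        "sustain_level": (sum(held) / len(held)) / peak,
        "release_time": (n - 1 - release_idx) / (n - 1),
    }


if __name__ == "__main__":
    # Toy envelope: rises to a peak, holds, then falls away.
    toy = [0.0, 0.5, 1.0, 0.8, 0.8, 0.4, 0.0]
    print(adsr_like_features(toy))
```

In a classifier along the lines described above, a vector like this would be concatenated with fundamental frequency features for each syllable; how the two are weighted or modelled is left open here.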
