Decision tree based Mandarin tone model and its application to speech recognition

Tone is an essential feature of Mandarin Chinese. How context affects tone pattern variation in continuous Mandarin speech is still not precisely understood. In this paper, we propose a decision-tree-based approach to quantify tone pattern variation in continuous Mandarin speech. Many possible factors besides the tones of neighboring syllables were considered when constructing the decision tree. Once the tree was built, 29 tone patterns were obtained automatically, and we found that the position of a syllable within its word, together with the syllable's consonant/vowel type, contributes substantially to tone pattern variation in continuous utterances. We also present a novel approach for integrating tone information into the search process at the word level. Experimental results show that the character error rate was reduced by 15.2%.
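The idea of letting a decision tree discover tone patterns from context can be illustrated with a minimal sketch. It is not the authors' implementation: the context features (`tone`, `position`), the three-point pitch contours, the squared-error split criterion, and the function names `grow` and `leaves` are all assumptions made for illustration. Each leaf of the grown tree stands for one tone pattern, represented here by the mean contour of the samples that reach it.

```python
# Toy decision-tree clustering of pitch contours by context (illustrative only).

def mean_contour(contours):
    """Point-wise mean of equal-length pitch contours."""
    n = len(contours)
    return [sum(c[i] for c in contours) / n for i in range(len(contours[0]))]

def sse(contours):
    """Sum of squared deviations of the contours from their mean contour."""
    if not contours:
        return 0.0
    m = mean_contour(contours)
    return sum((v - mv) ** 2 for c in contours for v, mv in zip(c, m))

def grow(samples, min_leaf=2, min_gain=1e-6):
    """Grow a binary tree over (context, contour) samples.

    Each question asks whether one context feature equals one value.
    The question with the largest reduction in squared error is chosen;
    a node becomes a leaf when no question gives a sufficient reduction.
    """
    contours = [c for _, c in samples]
    base = sse(contours)
    best = None
    questions = sorted({(k, v) for ctx, _ in samples for k, v in ctx.items()})
    for key, value in questions:
        yes = [s for s in samples if s[0][key] == value]
        no = [s for s in samples if s[0][key] != value]
        if len(yes) < min_leaf or len(no) < min_leaf:
            continue
        gain = base - sse([c for _, c in yes]) - sse([c for _, c in no])
        if gain > min_gain and (best is None or gain > best[0]):
            best = (gain, key, value, yes, no)
    if best is None:
        return {"pattern": mean_contour(contours), "count": len(samples)}
    _, key, value, yes, no = best
    return {
        "question": (key, value),
        "yes": grow(yes, min_leaf, min_gain),
        "no": grow(no, min_leaf, min_gain),
    }

def leaves(tree):
    """Collect the leaves (tone patterns) of a grown tree."""
    if "pattern" in tree:
        return [tree]
    return leaves(tree["yes"]) + leaves(tree["no"])

# Hypothetical data: tone 1 is level everywhere, tone 4 falls and is
# compressed in word-final position. Values are made up, not measurements.
samples = (
    [({"tone": "T1", "position": "initial"}, [5, 5, 5])] * 2
    + [({"tone": "T1", "position": "final"}, [5, 5, 5])] * 2
    + [({"tone": "T4", "position": "initial"}, [5, 3, 1])] * 2
    + [({"tone": "T4", "position": "final"}, [4, 2, 1])] * 2
)

tree = grow(samples)
for leaf in leaves(tree):
    print(leaf["count"], leaf["pattern"])
```

On this toy data the tree first splits on tone identity and then splits tone 4 by word position, so one lexical tone ends up with two context-dependent patterns. A real system would use many more context features and contours extracted from speech.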
