Paper information - Syllable HMM based Mandarin TTS and comparison with concatenative TTS

Syllable HMM based Mandarin TTS and comparison with concatenative TTS

This paper introduces a Syllable HMM based Mandarin TTS system. Each syllable is modeled by a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision for HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, can achieve an overall quality close to that of a concatenative TTS system with a data size of 1 GB.
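As a rough illustration of the 10-state left-to-right topology mentioned in the abstract, the sketch below builds a transition matrix in which each state may only stay where it is or advance to the next state. The function name `left_to_right_transitions`, the no-skip assumption, and the `self_loop` value of 0.6 are all hypothetical choices made for illustration; the abstract does not give the transition structure or parameters actually used.

```python
def left_to_right_transitions(n_states=10, self_loop=0.6):
    """Build a transition matrix for a left-to-right HMM without skips.

    self_loop is an assumed staying probability, not a value from the paper.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            # The last state has no successor, so it keeps all its mass.
            matrix[i][i] = 1.0
        else:
            matrix[i][i] = self_loop            # stay in the same state
            matrix[i][i + 1] = 1.0 - self_loop  # advance to the next state
    return matrix


# One matrix per syllable model; 10 states as stated in the abstract.
A = left_to_right_transitions()
```

Every entry below the diagonal is zero, which is what makes the model left-to-right: once a state has been left, it cannot be revisited within the same syllable.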

Zhiwei Shuang | Yong Qin | Lianhong Cai | Qin Shi | Shiyin Kang

[1] Danning Jiang et al. Overview of the IBM Mandarin Text-to-Speech System, 2006.

[2] Keiichi Tokuda et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, 1999, EUROSPEECH.

[3] Keiichi Tokuda et al. Multi-Space Probability Distribution HMM, 2002.

[4] Ren-Hua Wang et al. USTC System for Blizzard Challenge 2006: an Improved HMM-based Speech Synthesis Method, 2006, Blizzard Challenge.