A miniature Chinese TTS system based on tailored corpus

Miniature text-to-speech (TTS) systems are widely used in embedded systems and speech chips, where limited resources require a relatively small corpus and low computational complexity. Speech synthesized by conventional miniature TTS systems generally lacks naturalness because of the limited corpus size. This paper describes a method for automatically building a small corpus from a large speech database, and proposes a new distance measure among candidate instances. A miniature Chinese TTS system built on the tailored corpus can produce highly natural speech.
