A new Japanese TTS system based on speech-prosody database and speech modification

This paper describes a new Japanese text-to-speech (TTS) system that produces highly natural and intelligible synthetic speech. Its performance derives from three new approaches: (1) a prosody control algorithm that uses prosody data extracted from a natural speech database, together with a duration control algorithm based on statistical estimation; (2) a new type of synthesis unit consisting of a consonant and the vowel chain that follows it, prepared in various lengths and with various F0 contours so as to suppress unnatural sounds and acoustic discontinuities at concatenation points; and (3) a speech modification algorithm with harmonics reconstruction. Listening tests are carried out to evaluate the new modules and the overall performance of the system. The results confirm that the new modules work together effectively and that the system produces high-quality synthesized speech.
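
As an illustration of the synthesis unit in (2), the following minimal sketch groups a phoneme sequence into units made of one consonant and the vowels that follow it. It is not the system's actual procedure: the function name `split_into_cv_units`, the romanized phoneme symbols, the five-vowel set, and the treatment of an utterance-initial vowel as the start of its own unit are all assumptions made for this example. Each consonant, including the moraic nasal, is assumed to be a single symbol.

```python
# Hypothetical illustration only: the vowel inventory and the phoneme
# notation are assumptions, not taken from the system described above.
VOWELS = {"a", "i", "u", "e", "o"}


def split_into_cv_units(phonemes):
    """Group phonemes into units of one consonant plus its following vowel chain."""
    units = []
    for p in phonemes:
        if p in VOWELS and units:
            # A vowel extends the vowel chain of the current unit.
            units[-1].append(p)
        else:
            # A consonant (or an utterance-initial vowel) opens a new unit.
            units.append([p])
    return units


if __name__ == "__main__":
    # "kaisei" is split after the first vowel chain "ai".
    print(split_into_cv_units(["k", "a", "i", "s", "e", "i"]))
```

Under these assumptions a unit boundary always falls immediately before a consonant, never inside a vowel sequence, which is consistent with the stated aim of avoiding discontinuities at concatenation points.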