A new Chinese text-to-speech system with high naturalness

This paper introduces a new Chinese text-to-speech system that produces far more natural and intelligible synthesized speech than existing systems. The system has two distinguishing features. The first is a comprehensive set of prosodic rules derived from linguistic knowledge and from statistical results obtained on a standard Chinese database. These rules are used to modify the elemental synthesis units as they are concatenated into a sentence, which yields high naturalness. The second is the use of the log magnitude approximation (LMA) filter as the synthesis filter. With the LMA filter, the prosody of the synthesized speech can be modified over a wide range while maintaining high intelligibility and naturalness. The formulated prosodic rules are presented, and the LMA filter-based speech synthesis is described in detail.
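
As background on the synthesis filter: an LMA filter is generally described as approximating the exponential transfer function H(z) = exp(sum over m of c(m) z^-m), where c(m) are cepstral coefficients of the spectral envelope. The sketch below is not the system's filter and not the LMA structure itself. It only computes the exact impulse response of that target transfer function with the standard cepstrum-to-impulse-response recursion, so the response the filter aims for can be inspected. The function name and the example coefficients are hypothetical.

```python
import math


def cepstrum_to_impulse_response(c, n_samples):
    """Return the first n_samples of h, where H(z) = exp(sum_m c[m] z^-m).

    Uses the recursion n*h[n] = sum_{k=1..n} k*c[k]*h[n-k], which follows
    from differentiating H(z) = exp(C(z)). This is the target response an
    LMA filter approximates, not the LMA filter structure itself.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    h = [0.0] * n_samples
    h[0] = math.exp(c[0])  # the gain term comes from c[0]
    for n in range(1, n_samples):
        acc = 0.0
        # c[k] is treated as zero beyond the supplied cepstral order
        for k in range(1, min(n, len(c) - 1) + 1):
            acc += k * c[k] * h[n - k]
        h[n] = acc / n
    return h


# Hypothetical two-coefficient cepstrum: H(z) = exp(0.5 * z^-1),
# whose impulse response is 0.5**n / n!.
example = cepstrum_to_impulse_response([0.0, 0.5], 4)
```

In cepstrum-based synthesis of this kind, the spectral envelope lives in c(m) while pitch and duration are set by the excitation. That separation is one plausible reading of why prosody can be changed over a wide range without degrading the envelope.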
