A synthesis method based on concatenation of demisyllables and a residual excited vocal tract model

This paper describes the back-end of a new, flexible, high-quality TTS system. Preliminary results have demonstrated a highly natural and intelligible output. Although the system follows some standard methodologies, such as concatenation, we have introduced a number of novel features and a combination of techniques that make our system unique. We describe many of the design decisions in detail and compare them with those of other known systems. A demonstration of the speech quality with implanted prosody is available as waveform files (stltts1.wav and stltts2.wav) on the conference CD.

Nancy Niedzielski | Steve Pearson | Nick Kibre

[1] David B. Pisoni, et al. Text-to-speech: the MITalk system, 1987.

[2] Kazue Hata, et al. Combining concatenation and formant synthesis for improved intelligibility and naturalness in text-to-speech systems, 1994, Int. J. Speech Technol.

[3] Tadashi Kitamura, et al. A direct approximation technique of log magnitude response for digital filters, 1977.

[4] A. U.S., et al. Combinatorial issues in text-to-speech synthesis, 1997.

[5] Eric Moulines, et al. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, 1989, Speech Commun.

[6] Xavier Serra, et al. A sound analysis/synthesis system based on a deterministic plus stochastic decomposition, 1990.

[7] Kenji Matsui, et al. Improving naturalness in text-to-speech synthesis using natural glottal source, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[8] Kenji Matsui, et al. Text-to-speech synthesis using a natural voice source, 1990, ICSLP.

[9] Julius O. Smith, et al. Spectral modeling synthesis: A sound analysis/synthesis based on a deterministic plus stochastic decomposition, 1990.