Paper information - Experiments with voice modelling in speech synthesis

Abstract: Some experiments with voice modelling using recent developments of the KTH speech synthesis system will be presented. A new synthesizer, GLOVE, an extended version of OVE III, has been implemented in the system. It contains an improved glottal source built on the LF voice source model, some extra control parameters for the voiced and noise sources, and an extra pole/zero pair in the nasal branch. Furthermore, the present research versions of the KTH text-to-speech system include possibilities for interactive manipulation at the parameter level with on-screen reference to natural speech. The synthesis system constitutes a flexible environment for voice modelling experiments. The new synthesis tools and models were used for synthesis-by-analysis experiments. A sentence uttered by a female speaker was analysed, and a stylized copy was made using both the old and the new synthesis system. With the new system, the synthetic copy sounded very similar to the natural utterance.

Rolf Carlson | Björn Granström | Inger Karlsson
