Paper information - Generating emotional speech with a concatenative synthesizer

We describe an attempt to synthesize emotional speech with a concatenative speech synthesizer using a parameter space that covers not only f0, duration, and amplitude, but also voice quality parameters, spectral energy distribution, harmonics-to-noise ratio, and articulatory precision. Applying this extended parameter set makes it possible to combine the high segmental quality of concatenative synthesis with the wider range of control settings needed to synthesize natural affected speech.

Erhard Rank | Hannes Pirker
