Paper Information - Tempo Control in Speech Synthesis by Prosodic Phrasing

Tempo Control in Speech Synthesis by Prosodic Phrasing

Tempo control in most speech synthesisers is performed by linear time-scaling, although tempo change in human speech is non-linear. In a perception experiment with a German speech synthesiser, the versions with adjusted prosodic breaks and pauses were preferred over the linearly scaled versions for two fast rates and particularly for "very slow". However, the model for "rather slow" needs a refined syntax-prosody mapping.
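The contrast between linear time-scaling and a tempo change that treats pauses differently can be illustrated with a minimal sketch. This is not the model evaluated in the paper: the `Segment` type, the function names, and the `pause_weight` exponent are all hypothetical, and the sketch only rescales existing pauses rather than inserting or removing prosodic breaks.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str        # "speech" or "pause"
    duration: float  # seconds


def linear_scale(segments, factor):
    """Scale every segment, speech and pause alike, by the same factor."""
    return [Segment(s.kind, s.duration * factor) for s in segments]


def pause_weighted_scale(segments, factor, pause_weight=2.0):
    """Hypothetical non-linear variant: pauses absorb more of the tempo change.

    Speech is scaled by `factor`, pauses by `factor ** pause_weight`, so
    slowing down (factor > 1) lengthens pauses disproportionately and
    speeding up (factor < 1) shortens them disproportionately.
    """
    scaled = []
    for s in segments:
        f = factor ** pause_weight if s.kind == "pause" else factor
        scaled.append(Segment(s.kind, s.duration * f))
    return scaled


def total_duration(segments):
    """Sum of all segment durations in seconds."""
    return sum(s.duration for s in segments)


# Assumed toy utterance: two speech stretches separated by one pause.
utterance = [Segment("speech", 1.2), Segment("pause", 0.3), Segment("speech", 0.9)]

slow_linear = linear_scale(utterance, 1.5)
slow_weighted = pause_weighted_scale(utterance, 1.5)
```

With a factor of 1.5, both versions stretch the speech segments identically, but the pause-weighted version lengthens the pause more than the linear one, so a larger share of the added time falls at the prosodic boundary.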

Jürgen TROUVAIN

[1] Malcolm Slaney, et al. MACH1: nonuniform time-scale modification of speech, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98).

[2] D. Jeffery Higginbotham, et al. Discourse comprehension of synthetic speech delivered at normal and slow presentation rates, 1994.

[3] Brigitte Zellner-Keller. Prediction of Temporal Structures for Various Speech Rates, 1999.

[4] Frieda Goldman Eisler. Psycholinguistics: experiments in spontaneous speech, 1968.

[5] Stefanie Shattuck-Hufnagel, et al. The Use of Prosody in Syntactic Disambiguation, 1991, HLT.

[6] Marc Schröder, et al. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching, 2003, Int. J. Speech Technol.

[7] David B. Pisoni, et al. Text-to-speech: the MITalk system, 1987.