Paper information - Automatic Analysis and Synthesis of Fujisaki's Intonation Model for TTS

Automatic Analysis and Synthesis of Fujisaki's Intonation Model for TTS

This paper addresses the automatic analysis and synthesis of intonation using Fujisaki’s model. Both the accent commands and the phrase commands are tied to the accent group. We propose an analysis method that imposes strong linguistic constraints. The method compares well with other current methods and is the best option for synthesis, at least when accent groups are used. For synthesis, several prediction algorithms are evaluated. The results show that VCART, an extension of CART that predicts vector values, outperforms both standard CART and neural networks. The paper also analyzes which features are most relevant for predicting the parameters of Fujisaki’s model.
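For readers unfamiliar with the model the abstract refers to, the following is a minimal sketch of the standard Fujisaki formulation, in which log F0 is a baseline plus a sum of phrase components (impulse responses) and accent components (differences of step responses). It is not the paper's analysis or prediction method. The function names, the command tuples, and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions and are not taken from the paper.

```python
import math


def phrase_response(t, alpha=2.0):
    """Gp(t): impulse response of the phrase control mechanism.

    alpha (1/s) is an assumed, illustrative time constant.
    """
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Ga(t): step response of the accent control mechanism, capped at gamma.

    beta (1/s) and gamma are assumed, illustrative values.
    """
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrase_cmds, accent_cmds,
                alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb          : baseline frequency in Hz
    phrase_cmds : list of (T0, Ap) = (onset time, magnitude)
    accent_cmds : list of (T1, T2, Aa) = (onset, offset, amplitude)
    """
    log_f0 = math.log(fb)
    # Each phrase command adds a scaled, delayed impulse response.
    for t0, ap in phrase_cmds:
        log_f0 += ap * phrase_response(t - t0, alpha)
    # Each accent command adds the difference of two delayed step responses.
    for t1, t2, aa in accent_cmds:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


if __name__ == "__main__":
    # Hypothetical utterance: one phrase command and one accent command.
    phrases = [(0.0, 0.5)]
    accents = [(0.3, 0.6, 0.4)]
    # Sample the contour every 10 ms over 2 seconds.
    contour = [fujisaki_f0(i * 0.01, 100.0, phrases, accents)
               for i in range(200)]
    print(f"max F0 = {max(contour):.1f} Hz")
```

In this formulation, analysis amounts to estimating the command timings and amplitudes from a measured F0 contour, and synthesis amounts to predicting those same parameters from text features and evaluating the function above.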

Antonio Bonafonte | Klaus Wimmer
