Joint extraction and prediction of Fujisaki's intonation model parameters

This paper presents a joint extraction and prediction framework for intonation modeling, applied to Fujisaki's intonation model for text-to-speech conversion. Previous methods in the area first extract the accent and phrase command parameters sentence by sentence and then relate these parameters to linguistic features for prediction. In our approach, commands that share the same linguistic features are estimated globally. This is intended to overcome consistency problems in the extracted model parameters. The global nature of the parameter optimization also avoids the interpolation step, which can sometimes bias the extracted parameters. Experimental results show that the higher consistency of the parameters results in higher accuracy when fundamental frequency contours are predicted.
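For orientation, the accent and phrase commands mentioned above can be illustrated with a minimal sketch of the standard Fujisaki formulation, in which the logarithm of F0 is a baseline plus the summed responses to phrase commands (impulses) and accent commands (pedestals). This is not the paper's extraction or prediction procedure. The function names and the default constants (alpha = 2.0/s, beta = 20.0/s, gamma = 0.9) are illustrative assumptions based on commonly quoted values, not figures taken from this paper.

```python
import math


def phrase_response(t, alpha=2.0):
    """Phrase control response Gp(t) = alpha^2 * t * exp(-alpha * t) for t >= 0."""
    if t < 0.0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control response Ga(t) = min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0."""
    if t < 0.0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrase_cmds, accent_cmds, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb          -- baseline frequency in Hz
    phrase_cmds -- list of (onset_time, amplitude) pairs (hypothetical layout)
    accent_cmds -- list of (onset_time, offset_time, amplitude) triples (hypothetical layout)
    """
    log_f0 = math.log(fb)
    # Each phrase command is an impulse that excites the phrase control mechanism.
    for t0, ap in phrase_cmds:
        log_f0 += ap * phrase_response(t - t0, alpha)
    # Each accent command is a pedestal: an onset step minus an offset step.
    for t1, t2, aa in accent_cmds:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)
```

Evaluating `fujisaki_f0` on a grid of time points with one phrase command and one accent command yields a contour that rises above the baseline `fb` during the accent and decays back afterwards. In a joint scheme such as the one described above, commands sharing the same linguistic features would share their parameter values, but how that sharing is implemented is not specified here.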
