Paper information - Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model

A complete corpus-based process for generating prosodic features from text is developed. The process first predicts pauses and phone durations, and then generates F0 contours. Since F0 contour generation is based on the generation process model, the generated F0 contours are rather easy to manipulate at the command level. A method was developed for generating sentence F0 contours when a focus is placed on one of the “bunsetsu” of an utterance. The method predicts the differences in the F0 model commands between utterances with and without focus, and applies them to the F0 model commands predicted beforehand by the baseline method. The validity of the method was confirmed by an experiment on F0 contour generation and speech synthesis.
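
As a rough illustration of what command-level manipulation means, the sketch below builds an F0 contour from phrase and accent commands using the standard formulation of the generation process model (often called the Fujisaki model), then adds per-command amplitude differences to the accent commands. It is not the authors' implementation: the constants `ALPHA`, `BETA`, `GAMMA` are commonly used values assumed here, all class and function names are hypothetical, and the differences are hand-picked numbers standing in for the values the paper predicts from a corpus.

```python
import math
from dataclasses import dataclass, replace

# Commonly used model constants (assumed here, not taken from the paper).
ALPHA, BETA, GAMMA = 3.0, 20.0, 0.9


@dataclass(frozen=True)
class PhraseCommand:
    onset: float      # T0 in seconds
    magnitude: float  # Ap


@dataclass(frozen=True)
class AccentCommand:
    onset: float      # T1 in seconds
    offset: float     # T2 in seconds
    amplitude: float  # Aa


def phrase_response(t):
    """Impulse response of the phrase control mechanism."""
    return ALPHA ** 2 * t * math.exp(-ALPHA * t) if t >= 0 else 0.0


def accent_response(t):
    """Step response of the accent control mechanism, capped at GAMMA."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + BETA * t) * math.exp(-BETA * t), GAMMA)


def f0_contour(times, base_f0, phrases, accents):
    """ln F0(t) = ln Fb + sum of phrase components + sum of accent components."""
    contour = []
    for t in times:
        log_f0 = math.log(base_f0)
        log_f0 += sum(p.magnitude * phrase_response(t - p.onset) for p in phrases)
        log_f0 += sum(
            a.amplitude * (accent_response(t - a.onset) - accent_response(t - a.offset))
            for a in accents
        )
        contour.append(math.exp(log_f0))
    return contour


def apply_focus_differences(accents, amplitude_deltas):
    """Add one amplitude difference per accent command, clamping at zero."""
    return [
        replace(a, amplitude=max(0.0, a.amplitude + d))
        for a, d in zip(accents, amplitude_deltas)
    ]


# Illustrative values only: one phrase command and two accent commands,
# with the focus assumed to fall on the second accent command.
times = [i * 0.01 for i in range(200)]
phrases = [PhraseCommand(onset=0.0, magnitude=0.5)]
accents = [AccentCommand(0.2, 0.5, 0.4), AccentCommand(0.8, 1.2, 0.3)]
deltas = [0.0, 0.2]  # hand-picked stand-ins for predicted differences

baseline = f0_contour(times, 120.0, phrases, accents)
focused = f0_contour(times, 120.0, phrases, apply_focus_differences(accents, deltas))
```

Because the modification acts on the commands rather than on the contour samples, the focused contour is identical to the baseline before the second accent command begins and is raised only while that command is active.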

Keikichi Hirose | Keiko Ochi | Nobuaki Minematsu
