Communicative F0 generation based on impressions

This paper introduces our research on prosody control for the paralinguistic information embedded in communicative speech. To specify the output prosody, we employ a three-dimensional representation extracted from 26 impressions using Multi-Dimensional Scaling. Building on a series of our previous studies showing correlations between impressions and prosodic characteristics, we propose a computational scheme that obtains a communicative F0 pattern from the impressions conveyed by the input lexicons and the F0 pattern of the corresponding reading-style speech. Experiments confirmed the effectiveness of the proposed scheme for a set of expressions composed of impression-bearing lexicons. Finally, we discuss the remaining problems in applying the proposed scheme to other expressions.
