Communicative prosody generation using language common features provided by input lexicons

We previously examined the language-independent control characteristics of communicative prosody generation using multi-dimensional impressions of input lexicons. In this paper, we synthesized English single-phrase utterances using prosodic characteristics of Japanese speech, aiming at language-independent applications. The reading-style prosody of English phrases was modified using prosodic characteristics derived from the Japanese one-word utterance “n”. The modifications were based on lexical impressions along six impression categories: confident, doubtful, allowable, unacceptable, positive, and negative. A perceptual evaluation experiment showed the naturalness of speech whose communicative prosody was modified according to the impression of the input lexicons. These experimental results support the usefulness of communicative prosody control based on the impression of input lexicons and suggest the possibility of language-independent applications.
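The modification step can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the procedure used in the experiments: the function names `resample` and `modify_f0`, the representation of each impression as an F0 offset template in semitones, and the weighted-sum combination are all hypothetical choices made here for clarity.

```python
from typing import Dict, List

# The six impression categories named in the abstract.
IMPRESSIONS = ("confident", "doubtful", "allowable",
               "unacceptable", "positive", "negative")


def resample(contour: List[float], n: int) -> List[float]:
    """Linearly interpolate `contour` to exactly `n` points."""
    if n == 1 or len(contour) == 1:
        return [contour[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)  # position in source contour
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def modify_f0(base_f0_hz: List[float],
              weights: Dict[str, float],
              templates_st: Dict[str, List[float]]) -> List[float]:
    """Shift a reading-style F0 contour by impression-weighted templates.

    ASSUMPTION: each template is an F0 offset curve in semitones (for
    example, one measured on a one-word utterance), stretched to the
    length of the target phrase and scaled by the impression weight.
    Unvoiced frames encoded as 0 Hz remain 0 Hz after scaling.
    """
    n = len(base_f0_hz)
    offset_st = [0.0] * n
    for name, weight in weights.items():
        shape = resample(templates_st[name], n)
        for i in range(n):
            offset_st[i] += weight * shape[i]
    # Convert the summed semitone offset into a frequency ratio.
    return [f * 2 ** (st / 12) for f, st in zip(base_f0_hz, offset_st)]


if __name__ == "__main__":
    # Hypothetical values: a flat 100 Hz contour and a rising template.
    templates = {"doubtful": [0.0, 12.0]}
    print(modify_f0([100.0, 100.0, 100.0], {"doubtful": 1.0}, templates))
```

Working in semitones keeps the modification independent of the speaker's base pitch, which is one plausible way to carry a contour shape from one language's utterance to another's.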
