Paper information - Synthesis of prosodic styles

Synthesis of prosodic styles

A text-to-speech system can effectively imitate distinctive speaking styles when a few critical prosodic features are modeled and controlled. This paper demonstrates the methodology with a number of examples, including the ornamental notes and the amplitude profile that define the singing style of Dinah Shore, the phrase curve that sets off the dramatic speaking style of Martin Luther King Jr., and the variations in accent shape between two American English speakers. The styles are described with Stem-ML (Soft TEMplate Mark-up Language) tags, which offer the flexibility needed to control accent shapes, phrasal pitch contours, and amplitude profiles, for speech as well as for singing.

Chilin Shih | Greg Kochanski
