Emotional speech synthesis: a review

Attempts to add emotion effects to synthesised speech have been made for more than a decade. Several prototypes and fully operational systems have been built on different synthesis techniques, and a considerable number of smaller studies have been conducted. This paper aims to give an overview of what has been done in this field: it points out the inherent properties of the various synthesis techniques used, summarises the prosody rules employed, and examines the evaluation paradigms. Finally, it attempts to discuss interesting directions for future development.
