Paper information - Emotional speech synthesis by XML file using interactive genetic algorithms

As a technique that lets computers speak, speech synthesis is attracting growing attention. Much of today's speech synthesis software can synthesize neutral speech naturally and intelligibly. However, it is hard to make computers speak with emotion as people do in daily life, because emotion is complex to model. Interactive genetic algorithms, which are self-organizing, adaptive, and self-learning, can address the difficulty of modeling emotional speech synthesis. This paper therefore designs an emotional speech synthesis process that uses an interactive genetic algorithm to dynamically adjust the parameters (XML tags) used to synthesize emotional speech, in order to optimize its quality. The paper also reports an evaluation experiment that demonstrates the feasibility of the approach.
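The process described in the abstract can be illustrated with a minimal sketch of an interactive genetic algorithm loop. Everything specific here is an assumption made for illustration and is not taken from the paper: the three parameters (`rate`, `pitch`, `volume`), their ranges, the XML tag names, the population size, and the genetic operators. The listener's rating is represented by a callback, `rate_fn`, which in a real setup would play the synthesized speech and ask a person for a score.

```python
import random
from typing import Callable, Dict, Tuple
from xml.sax.saxutils import escape

# Hypothetical parameter ranges; the paper's actual XML tags are not listed here.
RANGES: Dict[str, Tuple[int, int]] = {
    "rate": (-10, 10),
    "pitch": (-10, 10),
    "volume": (0, 100),
}
Genome = Dict[str, int]


def random_genome(rng: random.Random) -> Genome:
    """Draw each parameter uniformly from its range."""
    return {k: rng.randint(lo, hi) for k, (lo, hi) in RANGES.items()}


def to_xml(g: Genome, text: str) -> str:
    """Wrap text in illustrative (assumed) XML tags carrying the parameters."""
    return (
        f'<rate speed="{g["rate"]}"><pitch middle="{g["pitch"]}">'
        f'<volume level="{g["volume"]}">{escape(text)}</volume></pitch></rate>'
    )


def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """Uniform crossover: each gene comes from one parent at random."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in RANGES}


def mutate(g: Genome, rng: random.Random, p: float) -> Genome:
    """Resample each gene with probability p."""
    return {
        k: (rng.randint(*RANGES[k]) if rng.random() < p else v)
        for k, v in g.items()
    }


def evolve(
    rate_fn: Callable[[str], float],
    text: str,
    pop_size: int = 6,
    generations: int = 5,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Genome:
    """Evolve parameters, using rate_fn (the listener's score) as fitness."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    best, best_score = population[0], float("-inf")
    for _ in range(generations):
        # The "interactive" step: every candidate is scored by the listener.
        scored = [(rate_fn(to_xml(g, text)), g) for g in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        # Keep the top half as parents and carry the top candidate over.
        parents = [g for _, g in scored[: max(2, pop_size // 2)]]
        population = [parents[0]] + [
            mutate(crossover(*rng.sample(parents, 2), rng), rng, mutation_rate)
            for _ in range(pop_size - 1)
        ]
    return best
```

Because a person has to rate every candidate, the population and the number of generations are kept small in this sketch to limit listener fatigue. For a dry run without audio, `rate_fn` can be any function that maps the XML string to a number.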

Xufa Wang | Shangfei Wang | Siliang Lv
