Mixed feelings: expression of non-basic emotions in a muscle-based talking head

We present an algorithm for generating facial expressions for a continuum of pure and mixed emotions of varying intensity. In natural interaction among humans, shades of emotion are encountered far more frequently than expressions of the basic emotions, so a method is needed that can generate expressions beyond Ekman's six basic emotions (joy, anger, fear, sadness, disgust and surprise). To this end, we have adapted the algorithm proposed by Tsapatsoulis et al. [1] to a physics-based facial animation system and a single, integrated emotion model. This facial animation system was combined with an equally flexible and expressive text-to-speech synthesis system, built on the same emotion model, to form a talking head capable of expressing non-basic emotions of varying intensity. We demonstrate the appropriateness of our approach with a variety of life-like intermediate facial expressions captured as snapshots from the system.
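The core idea of Tsapatsoulis-style interpolation, as described above, can be sketched as follows: place the basic emotions as archetypes on a 2D activation-evaluation disc, then derive a mixed expression by blending the muscle-parameter profiles of the angularly nearest archetypes, scaled by intensity (distance from the neutral centre). This is only an illustrative sketch; all archetype coordinates and muscle names below are hypothetical placeholders, not the paper's actual parameter values.

```python
import math

# Hypothetical archetypes on the activation-evaluation disc, each with an
# example "muscle contraction" profile at full intensity. Positions and
# values are illustrative placeholders, not the paper's data.
ARCHETYPES = {
    "joy":      {"pos": (0.6, 0.8),   "muscles": {"zygomaticus": 1.0, "frontalis": 0.3}},
    "anger":    {"pos": (0.7, -0.6),  "muscles": {"corrugator": 1.0, "levator": 0.4}},
    "sadness":  {"pos": (-0.6, -0.5), "muscles": {"depressor": 0.8, "frontalis_inner": 0.6}},
    "surprise": {"pos": (0.8, 0.3),   "muscles": {"frontalis": 1.0, "levator": 0.5}},
}

def _angle(p):
    return math.atan2(p[1], p[0])

def mixed_expression(activation, evaluation):
    """Blend the two angularly nearest archetypes, scaled by intensity
    (distance from the neutral centre of the disc)."""
    target = (evaluation, activation)  # x-axis = evaluation, y-axis = activation
    intensity = min(1.0, math.hypot(*target))
    if intensity == 0.0:
        return {}  # neutral face: no muscle contractions
    ta = _angle(target)

    def ang_dist(name):
        d = abs(_angle(ARCHETYPES[name]["pos"]) - ta)
        return min(d, 2 * math.pi - d)  # wrap around the disc

    # Pick the two archetypes closest in angle to the target emotion.
    a, b = sorted(ARCHETYPES, key=ang_dist)[:2]
    da, db = ang_dist(a), ang_dist(b)
    wa = db / (da + db) if (da + db) > 0 else 1.0
    wb = 1.0 - wa
    # Weighted sum of the two muscle profiles, scaled by overall intensity.
    blend = {}
    for name, w in ((a, wa), (b, wb)):
        for muscle, value in ARCHETYPES[name]["muscles"].items():
            blend[muscle] = blend.get(muscle, 0.0) + w * value * intensity
    return blend
```

An emotion point halfway between the centre and an archetype thus yields the same expression at half strength, which is how intermediate intensities of non-basic emotions arise without a dedicated expression model for each shade.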

[1] Christoph Bregler et al. Video Rewrite: Driving Visual Speech with Audio. SIGGRAPH, 1997.

[2] Brian Wyvill et al. Speech and expression: a computer solution to face animation. 1986.

[3] K. Scherer et al. Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 1996.

[4] Marc Schröder et al. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching. Int. J. Speech Technol., 2003.

[5] Roddy Cowie et al. Acoustic correlates of emotion dimensions in view of speech synthesis. INTERSPEECH, 2001.

[6] Laila Dybkjær et al. Affective Dialogue Systems. Lecture Notes in Computer Science, 2004.

[7] Roddy Cowie et al. Emotional speech: Towards a new generation of databases. Speech Commun., 2003.

[8] Hans-Peter Seidel et al. Head shop: generating animated head models with anatomical structure. SCA '02, 2002.

[9] Mark Steedman et al. Animated conversation: rule-based generation of facial expression, gesture & spoken intonation for multiple conversational agents. SIGGRAPH, 1994.

[11] Matthew Brand et al. Voice puppetry. SIGGRAPH, 1999.

[12] Luc J. Van Gool et al. A Visual Speech Generator. IS&T/SPIE Electronic Imaging, 2003.

[13] Paul J. W. ten Hagen et al. Emotion Disc and Emotion Squares: Tools to Explore the Facial Expression Space. Comput. Graph. Forum, 2003.

[14] Horace Ho-Shing Ip et al. Script-based facial gesture and speech animation using a NURBS based face model. Comput. Graph., 1996.

[15] Hans-Peter Seidel et al. "May I talk to you? :-)" - facial animation from text. 10th Pacific Conference on Computer Graphics and Applications, 2002.

[16] Irene Albrecht et al. Automatic Generation of Non-Verbal Facial Expressions from Speech. 2002.

[17] Frederic I. Parke et al. A parametric model for human faces. 1974.

[18] P. Ekman et al. The Repertoire of Nonverbal Behavior: Categories, Origins, Usage, and Coding. 1969.

[19] Nick Campbell et al. Optimising selection of units from speech databases for concatenative synthesis. EUROSPEECH, 1995.

[20] Marc Schröder et al. XML representation languages as a way of interconnecting TTS modules. INTERSPEECH, 2004.

[21] Shrikanth Narayanan et al. Limited domain synthesis of expressive military speech for animated characters. IEEE Workshop on Speech Synthesis, 2002.

[22] Hans-Peter Seidel et al. Geometry-based Muscle Modeling for Facial Animation. Graphics Interface, 2001.

[23] N. Badler et al. Linguistic Issues in Facial Animation. 1991.

[24] Roddy Cowie et al. FEELTRACE: an instrument for recording perceived emotion in real time. 2000.

[25] Roddy Cowie et al. Emotion Recognition and Synthesis Based on MPEG-4 FAPs. 2002.

[26] Dirk Heylen et al. Combination of facial movements on a 3D talking head. 2004.

[27] Cynthia Whissell et al. The Dictionary of Affect in Language. 1989.

[28] Brigitte Krenn et al. Generation of multimodal dialogue for net environments. 2002.

[29] Tomaso Poggio et al. Trainable Videorealistic Speech Animation. FGR, 2004.

[30] Daniel Thalmann et al. SMILE: A Multilayered Facial Animation System. Modeling in Computer Graphics, 1991.

[31] Petra Wagner et al. Speech synthesis development made easy: the Bonn Open Synthesis System. INTERSPEECH, 2001.

[32] Roddy Cowie et al. What a neural net needs to know about emotion words. 1999.

[33] Demetri Terzopoulos et al. Realistic modeling for facial animation. SIGGRAPH, 1995.

[34] Marc Schröder et al. Emotional speech synthesis: a review. INTERSPEECH, 2001.

[35] David B. Pisoni et al. Text-to-speech: the MITalk system. 1987.

[36] Marc Schröder et al. Dimensional Emotion Representation as a Basis for Speech Synthesis with Non-extreme Emotions. ADS, 2004.

[37] Mark Steedman et al. Generating Facial Expressions for Speech. Cogn. Sci., 1996.

[38] Norman I. Badler et al. Eyes alive. ACM Trans. Graph., 2002.

[39] Roddy Cowie et al. Describing the emotional states that are expressed in speech. Speech Commun., 2003.

[40] R. Plutchik. Emotion, a psychoevolutionary synthesis. 1980.

[41] Norman I. Badler et al. FacEMOTE: qualitative parametric modifiers for facial animations. SCA '02, 2002.

[42] Marc Schröder et al. Expressing vocal effort in concatenative synthesis. 2003.

[43] K. Scherer. Psychological models of emotion. 2000.

[44] Michael M. Cohen et al. Modeling Coarticulation in Synthetic Visual Speech. 1993.

[45] Thierry Dutoit et al. The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes. ICSLP '96, 1996.