Speaking with smile or disgust: data and models

This paper presents a preliminary analysis and modelling of facial motion-capture data recorded from a speaker uttering nonsense syllables and sentences with various acted facial expressions. We analyze the impact of facial expressions on articulation and determine the prediction errors of simple models trained to map neutral articulation to each of the targeted facial expressions. We show that the movements of some speech organs, such as the jaw and lower lip, are relatively unaffected by the facial expressions considered here (smile, disgust), while others, such as upper-lip movement and jaw translation, are strongly perturbed. We also show that these perturbations are not simply additive, and that they depend on
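To make the modelling setup concrete, below is a minimal sketch of one plausible instance of such a "simple model": a least-squares linear map from neutral articulatory trajectories to the corresponding trajectories produced with an expression, evaluated by per-marker prediction error. The paper does not specify the model class, so the linear regression, the array shapes, and the synthetic data here are all assumptions for illustration only.

```python
# Sketch (not the authors' code): fit a linear map from neutral articulation
# to "smile" articulation and report the prediction error per marker.
# Data shapes and the least-squares model are assumptions; the paper only
# states that simple models map neutral articulation to each expression.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical motion-capture data: T time frames x (3 * M) marker coordinates.
T, M = 500, 20
X_neutral = rng.normal(size=(T, 3 * M))                      # neutral condition
mixing = rng.normal(size=(3 * M, 3 * M)) * 0.1               # arbitrary synthetic coupling
X_smile = X_neutral @ mixing + rng.normal(size=(T, 3 * M)) * 0.05

# Least-squares linear map with bias term: X_smile ~ [X_neutral, 1] @ W
A = np.hstack([X_neutral, np.ones((T, 1))])
W, *_ = np.linalg.lstsq(A, X_smile, rcond=None)
X_pred = A @ W

# RMSE per marker (aggregating x, y, z over time) indicates which articulators
# are most perturbed by the expression relative to the neutral condition.
err = (X_pred - X_smile).reshape(T, M, 3)
rmse_per_marker = np.sqrt((err ** 2).sum(axis=2).mean(axis=0))
print(rmse_per_marker)
```

Comparing such per-marker errors across expressions is one way to quantify the claim that jaw and lower-lip movements are less affected than upper-lip movement or jaw translation.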
