Towards 4D Coupled Models of Conversational Facial Expression Interactions

In this paper we introduce a novel approach for building 4D coupled statistical models of conversational facial expression interactions. To build these coupled models we use 3D AAMs for feature extraction, 4D polynomial fitting for sequence representation, and concatenated feature vectors of frontchannel-backchannel interactions. Using a coupled model of conversational smile interactions, we predicted each sequence’s backchannel signal, and in a subsequent experiment human observers rated the predicted backchannel sequences as highly similar to the originals. Our results demonstrate the usefulness of coupled models as powerful tools for analysing and synthesising key aspects of conversational interactions, including conversation timings, backchannel responses to frontchannel signals, and the spatial and temporal dynamics of conversational facial expression interactions.

Methodology: Using a 4D database of natural, dyadic conversations [3], conversational interactions were manually annotated for conversational expressions. The sequences were tracked using a 4D sparse tracking approach that uses both 3D shape and texture. The tracked points are then used as control points in a dense correspondence method, which applies a Thin Plate Spline (TPS) based algorithm, with an additional “snapping” step, to modify the geometry of one mesh (the reference mesh) so that it matches that of another mesh (the target mesh). The tracking and inter-subject registration methods were developed in-lab; details of both approaches can be found in [2]. Statistical modelling of these sequences was performed using 3D Active Appearance Models (AAMs) [1] together with a polynomial regression technique for sequence representation.

Experiments: In Experiment 1, individual sequences were classified as either frontchannel or backchannel. In Experiment 2, these sequences were also modified and used in a perceptual experiment that evaluated the realism of the synthesised sequences. For Experiments 3 and 4, a coupled statistical model of conversational interactions was built by concatenating each frontchannel sequence’s feature vector with that of its corresponding backchannel sequence (Table 1).
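
To make the dense correspondence step concrete, the following Python sketch shows how a TPS warp driven by tracked control points can deform a reference mesh towards a target mesh. It uses SciPy’s thin-plate-spline interpolator, omits the additional “snapping” refinement, and all function and variable names are illustrative assumptions rather than the in-lab implementation described in [2].

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def warp_reference_to_target(ref_vertices, ref_control_pts, tgt_control_pts):
    """Deform the reference mesh so its control points land on the target's.

    ref_vertices:    (V, 3) vertices of the reference mesh
    ref_control_pts: (K, 3) tracked control points on the reference mesh
    tgt_control_pts: (K, 3) corresponding control points on the target mesh
    """
    # Fit a 3D thin-plate-spline mapping from reference to target control points.
    tps = RBFInterpolator(ref_control_pts, tgt_control_pts,
                          kernel='thin_plate_spline')
    # Apply the learned deformation to every reference vertex, giving a mesh
    # with the reference topology in dense correspondence with the target.
    return tps(ref_vertices)
```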
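
The polynomial sequence representation can similarly be sketched as fitting a low-order polynomial to each AAM parameter’s trajectory over normalised time and stacking the coefficients into one fixed-length feature vector per sequence. The polynomial order and time normalisation below are assumptions for illustration, not the settings used in the paper.

```python
import numpy as np

def sequence_feature_vector(aam_params, order=4):
    """Represent one tracked expression sequence as a fixed-length vector.

    aam_params: (T, P) array of 3D AAM shape/appearance parameters,
                one row per frame of the sequence.
    order:      polynomial order used to model each parameter's trajectory.
    """
    T, P = aam_params.shape
    t = np.linspace(0.0, 1.0, T)  # normalise sequence duration to [0, 1]
    # Fit one polynomial per AAM parameter and concatenate the coefficients.
    coeffs = [np.polyfit(t, aam_params[:, p], order) for p in range(P)]
    return np.concatenate(coeffs)  # length P * (order + 1)
```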
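
Finally, a coupled model over the concatenated frontchannel-backchannel feature vectors, and the prediction of a backchannel from an unseen frontchannel, can be sketched as below. The PCA construction and the least-squares reconstruction of the missing backchannel half are one standard way of using such a coupled model; this is an assumption-laden illustration, not necessarily the exact procedure of Experiments 3 and 4.

```python
import numpy as np

class CoupledModel:
    """PCA over concatenated frontchannel+backchannel sequence vectors."""

    def fit(self, front_feats, back_feats, n_modes=10):
        # front_feats: (N, Df), back_feats: (N, Db); one row per annotated
        # frontchannel-backchannel interaction.
        X = np.hstack([front_feats, back_feats])
        self.split = front_feats.shape[1]
        self.mean = X.mean(axis=0)
        # Principal modes of the joint (coupled) variation.
        _, _, Vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.modes = Vt[:n_modes]                      # (n_modes, Df + Db)
        return self

    def predict_backchannel(self, front_vec):
        # Fit the coupled model parameters using only the frontchannel rows
        # of the basis, then reconstruct the backchannel half.
        A = self.modes[:, :self.split].T               # (Df, n_modes)
        b = front_vec - self.mean[:self.split]
        params, *_ = np.linalg.lstsq(A, b, rcond=None)
        recon = self.mean + params @ self.modes
        return recon[self.split:]                      # predicted backchannel
```

The predicted backchannel coefficient vector can then be evaluated back into an AAM parameter trajectory, and hence a mesh sequence, by reversing the polynomial representation sketched above.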

[1]  David Thomas,et al.  The Art in Computer Programming , 2001 .

[2]  M. Castelán,et al.  Face shape recovery from a single image view , 2006 .

[3]  Brenda G. Cox THE WEIGHTED SEQUENTIAL HOT DECK IMPUTATION PROCEDURE , 2002 .

[4]  V. Yngve On getting a word in edgewise , 1970 .

[5]  Kristinn R. Thórisson,et al.  The Power of a Nod and a Glance: Envelope Vs. Emotional Feedback in Animated Conversational Agents , 1999, Appl. Artif. Intell..

[6]  Catherine Pelachaud,et al.  Performative facial expressions in animated faces , 2001 .

[7]  Bernhard Schölkopf,et al.  New Support Vector Algorithms , 2000, Neural Computation.

[8]  Russ B. Altman,et al.  Missing value estimation methods for DNA microarrays , 2001, Bioinform..

[9]  Petra Wagner,et al.  D64: a corpus of richly recorded conversational interaction , 2013, Journal on Multimodal User Interfaces.

[10]  Aphrodite Galata,et al.  Visual Models of Interaction , 1998 .

[11]  Dirk Heylen,et al.  The MultiLis Corpus - Dealing with Individual Differences in Nonverbal Listening Behavior , 2010, COST 2102 Training School.

[12]  Lijun Yin,et al.  Static and dynamic 3D facial expression recognition: A comprehensive survey , 2012, Image Vis. Comput..

[13]  Timothy F. Cootes,et al.  Active Appearance Models , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Maja Pantic,et al.  Coupled Gaussian processes for pose-invariant facial expression recognition , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Louis-Philippe Morency,et al.  Predicting Listener Backchannels: A Probabilistic Multimodal Approach , 2008, IVA.

[16]  Shaun J. Canavan,et al.  BP4D-Spontaneous: a high-resolution spontaneous 3D dynamic facial expression database , 2014, Image Vis. Comput..

[17]  Bogdan J. Matuszewski,et al.  Hi4D-ADSIP 3-D dynamic facial articulation database , 2012, Image Vis. Comput..

[18]  Chih-Jen Lin,et al.  A Practical Guide to Support Vector Classification , 2008 .

[19]  Janet Beavin Bavelas,et al.  The psychology of facial expression: Faces in dialogue , 1997 .

[20]  Donald E. Knuth,et al.  The art of computer programming. Vol.2: Seminumerical algorithms , 1981 .

[21]  L. J. Brunner,et al.  Smiles can be back channels. , 1979 .

[22]  Christian Wallraven,et al.  The Face Speaks: Contextual and Temporal Sensitivity to Backchannel Responses , 2012, ACCV Workshops.

[23]  P. Ekman,et al.  Felt, false, and miserable smiles , 1982 .

[24]  Ekaterina P. Volkova,et al.  Emotion categorization of body expressions in narrative scenarios , 2014, Front. Psychol..

[25]  C. R. de Boor  A practical guide to splines , 1978 .

[26]  Trevor Hastie,et al.  Imputing Missing Data for Gene Expression Arrays , 2001 .

[27]  John C. Tang,et al.  What video can and cannot do for collaboration: A case study , 2005, Multimedia Systems.

[28]  Hao Yan,et al.  More than just a pretty face: conversational protocols and the affordances of embodiment , 2001, Knowl. Based Syst..

[29]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[30]  Maja Pantic,et al.  IEEE Transactions on Affective Computing , 2022 .

[31]  Eva Krumhuber,et al.  Moving Smiles: The Role of Dynamic Components for the Perception of the Genuineness of Smiles , 2005 .

[32]  F. W. Kellaway,et al.  Advanced Engineering Mathematics , 1969, The Mathematical Gazette.

[33]  Gustavo E. A. P. A. Batista,et al.  A Study of K-Nearest Neighbour as an Imputation Method , 2002, HIS.

[34]  J. Bavelas,et al.  "I show how you feel": Motor mimicry as a communicative act. , 1986 .

[35]  Andrew P. Bradley,et al.  The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit..

[36]  Yiying Tong,et al.  FaceWarehouse: A 3D Facial Expression Database for Visual Computing , 2014, IEEE Transactions on Visualization and Computer Graphics.

[37]  Gwen Littlewort,et al.  Fully Automatic Facial Action Recognition in Spontaneous Behavior , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[38]  Paul L. Rosin,et al.  4D Cardiff Conversation Database (4D CCDb): a 4D database of natural, dyadic conversations , 2015, AVSP.

[39]  Timothy F. Cootes,et al.  Model-based methods in analysis of Biomedical images , 1999 .

[40]  J. Bavelas,et al.  Listeners as co-narrators. , 2000, Journal of personality and social psychology.

[41]  Roel Vertegaal Conversational awareness in multiparty VMC , 1997, CHI Extended Abstracts.

[42]  Trevor Hastie,et al.  Clustering microarray data , 2003 .

[43]  Donald E. Knuth The Art of Computer Programming 2 / Seminumerical Algorithms , 1971 .

[44]  Alessandro Vinciarelli,et al.  Canal9: A database of political debates for analysis of social interactions , 2009, 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops.

[45]  Guillermo Recio Perception of dynamic facial expressions of emotion , 2013 .