4D Cardiff Conversation Database (4D CCDb): a 4D database of natural, dyadic conversations

The 4D Cardiff Conversation Database (4D CCDb) is the first 4D (3D video) audio-visual database containing natural conversations between pairs of people. This publicly available database contains 17 conversations, fully annotated for speaker and listener activity: conversational facial expressions, head motion, and verbal/non-verbal utterances. It can be accessed at http://www.cs.cf.ac.uk/CCDb. In this paper we describe the data collection and annotation process. We also report a baseline classification experiment distinguishing frontchannel from backchannel smiles, using 3D Active Appearance Models for feature extraction, polynomial fitting for representing the data as 4D sequences, and Support Vector Machines for classification. We believe this expression-rich, audio-visual database of natural conversations will make a useful contribution to the computer vision, affective computing, and cognitive science communities by providing raw data, features, annotations, and baseline comparisons.
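The baseline pipeline described above (per-frame AAM parameters, summarized over time by polynomial fitting, then classified with an SVM) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the descriptor function, polynomial degree, and SVM settings are all assumptions, and the random trajectories merely stand in for real AAM parameter tracks.

```python
# Hypothetical sketch of the baseline: each sequence of AAM parameters is
# summarized by per-parameter polynomial coefficients (a fixed-length
# descriptor for variable-length sequences), then classified with an SVM.
# All data here are synthetic; names and settings are illustrative only.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def poly_descriptor(traj, degree=3):
    """Fit a polynomial over normalized time to each AAM parameter's
    trajectory and concatenate the coefficients."""
    t = np.linspace(0.0, 1.0, len(traj))
    coeffs = [np.polyfit(t, traj[:, k], degree) for k in range(traj.shape[1])]
    return np.concatenate(coeffs)  # length = (degree + 1) * n_params

# Synthetic stand-in: 40 sequences of varying length, 5 AAM parameters each.
# One toy class drifts upward over time, the other stays flat.
X, y = [], []
for i in range(40):
    frames = int(rng.integers(20, 60))
    label = i % 2
    traj = rng.normal(0.0, 0.1, size=(frames, 5))
    if label:
        traj += np.linspace(0.0, 1.0, frames)[:, None]
    X.append(poly_descriptor(traj))
    y.append(label)

clf = SVC(kernel="rbf").fit(np.vstack(X), y)
train_acc = clf.score(np.vstack(X), y)
```

The point of the polynomial step is that conversational sequences have different lengths, while an SVM needs fixed-length input; fitting a low-degree polynomial to each parameter's time course yields a compact, length-independent representation of the 4D dynamics.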
