A New Multi-modal Dataset for Human Affect Analysis

In this paper we present a new multi-modal dataset of spontaneous three-way human interactions. Participants were recorded in an unconstrained environment at various locations during a sequence of debates held in a Skype-style video-conference arrangement. An additional depth modality was introduced, permitting the capture of 3D information alongside the video and audio signals. The dataset comprises 16 participants and is subdivided into 6 unique sections. It was manually annotated on a continuous scale across five affective dimensions: arousal, valence, agreement, content, and interest. The annotation was performed by three human annotators, and the ensemble average of their traces was calculated for use as the dataset's labels. The corpus enables the analysis of human affect during conversations in a real-life scenario. We first briefly review existing affect datasets and the methodologies related to affect dataset construction, and then detail how our dataset was constructed.
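The ensemble-averaging step described above is simple to reproduce. Below is a minimal Python sketch, assuming the three annotators' continuous traces for a single affective dimension have already been resampled to a common frame rate; the array values and variable names are hypothetical and do not reflect the dataset's actual file format.

```python
import numpy as np

# Hypothetical per-frame annotation traces from three annotators for one
# affective dimension (e.g. arousal), aligned to a common frame rate.
annotator_traces = np.array([
    [0.1, 0.3, 0.5, 0.4],   # annotator 1
    [0.2, 0.2, 0.6, 0.5],   # annotator 2
    [0.0, 0.4, 0.4, 0.3],   # annotator 3
])

# Ensemble average: the per-frame mean across annotators, used here as the
# ground-truth label for that dimension.
ground_truth = annotator_traces.mean(axis=0)
print(ground_truth)  # [0.1 0.3 0.5 0.4]
```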
