Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions

In this paper we present RECOLA, a new multimodal corpus of spontaneous collaborative and affective interactions in French, which is being made available to the research community. Participants were recorded in dyads during a video conference while completing a task that required collaboration. Multimodal data, i.e., audio, video, electrocardiogram (ECG) and electrodermal activity (EDA), were recorded continuously and synchronously. In total, 46 participants took part in the test; the first five minutes of each interaction were kept to ease annotation. In addition to these recordings, six annotators measured emotion continuously on two dimensions, arousal and valence, as well as social behaviour on five dimensions. Self-report measures were also collected from participants during task completion. Methodologies and issues related to the construction of affective corpora are briefly reviewed, and we detail how the corpus was constructed, i.e., the participants, procedure and task, the multimodal recording setup, the annotation of the data, and an analysis of the quality of these annotations.
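The quality of continuous dimensional annotations is commonly assessed with an inter-rater consistency measure such as Cronbach's alpha. The following is a minimal sketch of that computation, not the paper's own implementation; the array layout (one column per annotator, traces resampled to a common frame rate) and the synthetic example data are assumptions for illustration.

```python
import numpy as np

def cronbach_alpha(ratings: np.ndarray) -> float:
    """Cronbach's alpha for inter-rater consistency.

    ratings: array of shape (n_frames, n_raters), holding one continuous
    annotation trace (e.g., arousal) per rater, resampled to a common
    frame rate. (Layout is an assumption for this sketch.)
    """
    n_raters = ratings.shape[1]
    rater_vars = ratings.var(axis=0, ddof=1)      # variance of each rater's trace
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of the summed trace
    return n_raters / (n_raters - 1) * (1.0 - rater_vars.sum() / total_var)

# Hypothetical example: six raters tracking the same slow trend over 300 frames.
rng = np.random.default_rng(0)
trend = np.sin(np.linspace(0, 4 * np.pi, 300))
ratings = np.stack([trend + 0.3 * rng.standard_normal(300) for _ in range(6)],
                   axis=1)
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

Alpha close to 1 indicates that the raters' traces vary together; values below roughly 0.7 are usually taken as a sign that a dimension is hard to annotate reliably.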
