Unimodal and Multimodal Human Perception of Naturalistic Non-Basic Affective States during Human-Computer Interactions

The present study investigated unimodal and multimodal emotion perception by humans, with an eye toward applying the findings to automated affect detection. The focus was on assessing the reliability with which untrained human observers could detect naturalistic expressions of non-basic affective states (boredom, engagement/flow, confusion, frustration, and neutral) from previously recorded videos of learners interacting with a computer tutor. The experiment manipulated three modalities to produce seven conditions: face, speech, context, face+speech, face+context, speech+context, and face+speech+context. Agreement between two observers (OO) and agreement between an observer and a learner (LO) were computed and analyzed with mixed-effects logistic regression models. The results indicated that agreement was generally low (kappas ranged from .030 to .183) but, with one exception, greater than chance. Comparisons of overall agreement (across affective states) between the unimodal and multimodal conditions supported redundancy effects between modalities, but superadditive, additive, redundant, and inhibitory effects all emerged when affective states were considered individually. Patterns in the OO and LO data sets both converged and diverged; notably, the LO models yielded lower agreement but stronger multimodal effects than the OO models. Implications of the findings for automated affect detection are discussed.
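The abstract names two quantitative steps: inter-rater agreement quantified with kappa, and trial-level agreement analyzed with mixed-effects logistic regression. The sketch below is a minimal illustration of that pairing under stated assumptions: the data are simulated, the column names (observer1, observer2, learner, condition) are hypothetical, and statsmodels' Bayesian mixed GLM is used as a stand-in for whatever mixed-effects implementation the authors actually used; it is not the paper's pipeline.

```python
# Hypothetical sketch of the two analyses described in the abstract:
# (1) inter-rater agreement via Cohen's kappa per modality condition, and
# (2) a mixed-effects logistic regression of trial-level agreement on
#     condition with a random intercept per learner.
# All data here are simulated; column names and effect sizes are assumptions.

import numpy as np
import pandas as pd
from sklearn.metrics import cohen_kappa_score
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(0)
states = ["boredom", "engagement", "confusion", "frustration", "neutral"]
conditions = ["face", "speech", "context", "face+speech",
              "face+context", "speech+context", "face+speech+context"]

# Simulate paired judgments: two observers label the same video clips.
n = 700
df = pd.DataFrame({
    "condition": rng.choice(conditions, n),
    "learner": rng.integers(0, 30, n),   # grouping factor for random effects
    "observer1": rng.choice(states, n),
})
# Make observer 2 agree ~30% of the time to mimic the low kappas reported.
agree = rng.random(n) < 0.30
df["observer2"] = np.where(agree, df["observer1"], rng.choice(states, n))

# (1) Cohen's kappa within each modality condition.
for cond, grp in df.groupby("condition"):
    kappa = cohen_kappa_score(grp["observer1"], grp["observer2"])
    print(f"{cond:>20s}  kappa = {kappa:.3f}")

# (2) Mixed-effects logistic regression: does modality condition predict
#     trial-level agreement, allowing a random intercept per learner?
df["agree"] = (df["observer1"] == df["observer2"]).astype(int)
model = BinomialBayesMixedGLM.from_formula(
    "agree ~ C(condition)",
    vc_formulas={"learner": "0 + C(learner)"},
    data=df,
)
result = model.fit_vb()   # variational Bayes fit
print(result.summary())
```

With simulated data of this kind, the per-condition kappas hover near the low values the abstract reports, and the fixed-effect coefficients for C(condition) are what a comparison of unimodal versus multimodal agreement would inspect.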
