A Multimodal Emotion Detection System during Human-Robot Interaction

In this paper, a multimodal user-emotion detection system for social robots is presented. The system is intended to be used during human–robot interaction and is integrated as part of the robot's overall interaction system, the Robotics Dialog System (RDS). Two modalities are used to detect emotions: voice and facial expression analysis. To analyze the user's voice, a new component has been developed: Gender and Emotion Voice Analysis (GEVA), written in the ChucK language. For emotion detection in facial expressions, another component, Gender and Emotion Facial Analysis (GEFA), has also been developed. It integrates two third-party solutions: the Sophisticated High-speed Object Recognition Engine (SHORE) and the Computer Expression Recognition Toolbox (CERT). Once these new components (GEVA and GEFA) produce their results, a decision rule is applied to combine the information provided by both. The result of this rule, the detected emotion, is passed to the dialog system through communicative acts. Hence, each communicative act conveys, among other things, the detected emotion of the user to the RDS, so that it can adapt its strategy to achieve greater user satisfaction during the human–robot dialog. Each of the new components, GEVA and GEFA, can also be used individually. Moreover, they are integrated with the robotic control platform ROS (Robot Operating System). Several experiments with real users were performed to determine the accuracy of each component and to set the final decision rule. The results of applying this decision rule in these experiments show a high success rate in automatic user-emotion recognition, improving on the results obtained from either information channel (audio or visual) alone.
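The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify the actual decision rule, so the structure, names, and weighting below are purely illustrative assumptions, not the paper's method:

```python
# Hypothetical sketch of a bimodal decision rule combining a voice-analysis
# result (GEVA-style) with a facial-expression result (GEFA-style).
# The real rule in the paper is not given here; weights are illustrative.
from dataclasses import dataclass


@dataclass
class ModalityResult:
    emotion: str       # e.g. "happy", "sad", "neutral"
    confidence: float  # classifier confidence in [0, 1]


def fuse(voice: ModalityResult, face: ModalityResult,
         voice_weight: float = 0.5) -> str:
    """Combine the two modality outputs into one detected emotion."""
    if voice.emotion == face.emotion:
        return voice.emotion  # both channels agree: trivial case
    # On disagreement, pick the channel with the higher weighted confidence.
    v_score = voice_weight * voice.confidence
    f_score = (1.0 - voice_weight) * face.confidence
    return voice.emotion if v_score >= f_score else face.emotion


detected = fuse(ModalityResult("happy", 0.8), ModalityResult("neutral", 0.6))
print(detected)  # with equal weights the more confident voice channel wins: happy
```

The detected emotion would then be wrapped in a communicative act and handed to the dialog manager, which can adjust its dialog strategy accordingly.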
