Audio-lingual and Visual-facial Emotion Recognition: Towards a Bi-modal Interaction System

Towards building a multimodal affect recognition system, we have developed a facial expression recognition system and an audio-lingual affect recognition system. In this paper, we present and discuss the development and evaluation of these two subsystems, which recognize emotions from the visual-facial and audio-lingual modalities respectively. Many researchers agree that these modalities complement each other and that combining them can improve the accuracy of affective user models. We therefore present a combination of the two modes based on multi-criteria decision-making theories. The resulting system exploits the strengths of each mode and achieves more accurate emotion recognition.
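To illustrate the kind of bi-modal fusion a multi-criteria decision-making approach enables, the sketch below applies a simple additive weighting rule, one of the classical multi-criteria methods, to per-emotion confidence scores produced by the two subsystems. This is a minimal sketch under stated assumptions: the emotion classes, weights, function names, and scores are illustrative, not the paper's actual parameters or implementation.

```python
# Minimal sketch of multi-criteria (weighted-sum) fusion of two
# emotion-recognition modalities. All names, weights, and scores
# are illustrative assumptions, not the authors' parameters.

EMOTIONS = ["neutral", "happiness", "sadness", "surprise", "anger", "disgust"]

def fuse_modalities(audio_scores, facial_scores, w_audio=0.4, w_facial=0.6):
    """Combine per-emotion confidence scores from the audio-lingual and
    visual-facial subsystems with a simple additive weighting rule."""
    fused = {
        emotion: w_audio * audio_scores[emotion] + w_facial * facial_scores[emotion]
        for emotion in EMOTIONS
    }
    # The recognized emotion is the class with the highest fused score.
    return max(fused, key=fused.get), fused

# Hypothetical subsystem outputs (confidence per emotion class).
audio = {"neutral": 0.10, "happiness": 0.20, "sadness": 0.40,
         "surprise": 0.10, "anger": 0.15, "disgust": 0.05}
facial = {"neutral": 0.05, "happiness": 0.10, "sadness": 0.55,
          "surprise": 0.10, "anger": 0.10, "disgust": 0.10}

label, scores = fuse_modalities(audio, facial)
print(label, scores[label])  # highest fused score wins, here "sadness"
```

Weighting the visual-facial mode more heavily here is only one possible choice; in practice the weights would be tuned so that each mode contributes most for the emotions it discriminates best.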
