Paper Information - Multimodal Emotion Recognition

Multimodal Emotion Recognition

Multimodal fusion combines two or more forms of input to achieve higher overall classification accuracy than the individual unimodal systems. It is a popular technique in emotion recognition. In this study, we investigated how much decision-level fusion could improve upon individual unimodal systems. To do so, we acquired two emotion classification systems, one that worked on audio input alone and another that worked on visual input, and combined their outputs using a set of manual rules and a classifier to achieve higher classification accuracy.
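As a rough illustration of decision-level fusion, the sketch below combines the outputs of two unimodal systems with a single hand-written rule. It is not the rule set or classifier used in the study: the `Prediction` type, the `fuse_decisions` function, the confidence scores, and the tie-breaking choice are all hypothetical and serve only to show the general idea.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """Output of one unimodal system (hypothetical structure)."""
    label: str         # predicted emotion, e.g. "happy"
    confidence: float  # score in [0, 1] reported by that system


def fuse_decisions(audio: Prediction, visual: Prediction) -> str:
    """Assumed manual rule: keep agreement, otherwise trust the more confident modality."""
    if audio.label == visual.label:
        return audio.label
    # On a confidence tie the visual label wins; this is an arbitrary illustrative choice.
    return audio.label if audio.confidence > visual.confidence else visual.label


if __name__ == "__main__":
    print(fuse_decisions(Prediction("angry", 0.8), Prediction("sad", 0.5)))
```

A trained classifier could take the place of the hand-written rule by treating the two systems' labels and scores as its input features, which is one plausible reading of "a set of manual rules and a classifier" in the abstract.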

Colin Grubb

[1] Valery A. Petrushin, et al. Emotion in Speech: Recognition and Application to Call Centers, 1999.

[2] Catherine Pelachaud, et al. From Greta's mind to her face: modelling the dynamics of affective states in a conversational embodied agent, 2003, Int. J. Hum. Comput. Stud.

[3] Zhigang Deng, et al. Analysis of emotion recognition using facial expressions, speech and multimodal information, 2004, ICMI '04.

[4] Nicu Sebe, et al. Authentic Facial Expression Analysis, 2004, FGR.

[5] Shrikanth S. Narayanan, et al. Toward detecting emotions in spoken dialogs, 2005, IEEE Transactions on Speech and Audio Processing.

[6] Loïc Kessous, et al. Multimodal emotion recognition from expressive faces, body gestures and speech, 2007, AIAI.

[7] Elisabeth André, et al. EmoVoice - A Framework for Online Recognition of Emotions from Voice, 2008, PIT.

[8] Shane F. Cotter. Recognition of occluded facial expressions using a Fusion of Localized Sparse Representation Classifiers, 2011, 2011 Digital Signal Processing and Signal Processing Education Meeting (DSP/SPE).

[9] อนิรุธ สืบสิงห์, et al. Data Mining: Practical Machine Learning Tools and Techniques, 2014.