Feature Fusion Algorithm for Multimodal Emotion Recognition from Speech and Facial Expression Signal

To overcome the limitations of single-modality emotion recognition, this paper presents a novel multimodal emotion recognition algorithm that takes speech signals and facial expression signals as its research subjects. First, the speech and facial expression features are fused, sample sets are drawn by sampling with replacement, and a classifier is trained on each sample set with a BP neural network (BPNN). Second, the difference between pairs of classifiers is measured with a double-error difference selection strategy. Finally, the recognition result is obtained by majority voting over the selected classifiers. Experiments show that the method exploits the complementary advantages of feature-level and decision-level fusion, brings the fusion process closer to human emotion recognition, and improves accuracy, achieving a recognition rate of 90.4%.
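The pipeline in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names, the `CentroidClassifier` (a stand-in for the paper's BPNN), and the use of the double-fault rate as the pairwise "double-error" difference measure are all assumptions made for the sketch.

```python
import random
from collections import Counter

def fuse_features(speech_feats, face_feats):
    # Feature-level fusion: concatenate each speech vector with the
    # corresponding facial-expression vector.
    return [s + f for s, f in zip(speech_feats, face_feats)]

def bootstrap(samples, labels, rng):
    # Sampling with replacement ("putting back sampling") to build
    # one training set per base classifier.
    idx = [rng.randrange(len(samples)) for _ in range(len(samples))]
    return [samples[i] for i in idx], [labels[i] for i in idx]

class CentroidClassifier:
    # Hypothetical stand-in for the paper's BPNN base learner:
    # predicts the label of the nearest class centroid.
    def fit(self, X, y):
        sums, counts = {}, {}
        for x, lab in zip(X, y):
            acc = sums.setdefault(lab, [0.0] * len(x))
            for j, v in enumerate(x):
                acc[j] += v
            counts[lab] = counts.get(lab, 0) + 1
        self.centroids = {lab: [v / counts[lab] for v in acc]
                          for lab, acc in sums.items()}
        return self
    def predict(self, x):
        return min(self.centroids,
                   key=lambda lab: sum((a - b) ** 2
                                       for a, b in zip(self.centroids[lab], x)))

def double_fault(c1, c2, X, y):
    # One common pairwise difference measure: the fraction of samples
    # that BOTH classifiers misclassify (lower = more diverse pair).
    both_wrong = sum(1 for x, lab in zip(X, y)
                     if c1.predict(x) != lab and c2.predict(x) != lab)
    return both_wrong / len(X)

def majority_vote(classifiers, x):
    # Decision-level fusion: each classifier casts one vote.
    votes = Counter(clf.predict(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Toy data: two-dimensional features per modality, two emotion classes.
rng = random.Random(0)
speech = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
face   = [[0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.4, 0.6]]
labels = ["happy", "happy", "sad", "sad"]

X = fuse_features(speech, face)
ensemble = []
for _ in range(5):
    Xb, yb = bootstrap(X, labels, rng)
    ensemble.append(CentroidClassifier().fit(Xb, yb))

query = fuse_features([[0.85, 0.15]], [[0.65, 0.35]])[0]
pred = majority_vote(ensemble, query)
```

In the paper the double-error measure is used to *select* sufficiently different classifiers before voting; here `double_fault` only shows how such a pairwise score could be computed.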
