Multimodal Facial Expression Recognition Based on Dempster-Shafer Theory Fusion Strategy

In this paper, we present our facial expression recognition framework for the Multimodal Emotion Recognition Challenge 2017. The challenge task is to recognize one of eight emotions in short video clips extracted from Chinese films, TV plays, and talk shows. Because each video carries only a single video-level label while its individual frames are unlabeled, we improved an unsupervised autoencoder and used it for feature extraction. In addition, we extracted complementary features, including traditional hand-crafted features and deep learning features. Finally, the random forest algorithm was used to classify the emotion labels, followed by a decision-level fusion method based on Dempster-Shafer evidence theory. The macro average precision (MAP) of our best submission reaches 59.68% on the test set, significantly outperforming the baseline of 30.63%.
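To make the fusion step concrete, the sketch below applies Dempster's rule of combination to class posteriors from two modality-specific classifiers, treating each posterior distribution as a basic probability assignment over singleton emotion hypotheses. This is an illustrative reconstruction under those assumptions, not the paper's released implementation; the label set, example scores, and function names are hypothetical.

```python
# Minimal sketch of Dempster's rule for decision-level fusion, assuming each
# classifier's class posteriors are masses on singleton hypotheses only.

EMOTIONS = ["angry", "anxious", "disgust", "happy",
            "neutral", "sad", "surprise", "worried"]  # hypothetical label set


def combine_masses(m1, m2):
    """Fuse two basic probability assignments with Dempster's rule.

    m1, m2: dicts mapping each singleton label to its mass (each sums to 1).
    Returns the normalized combined masses; raises on total conflict.
    """
    # With singleton-only focal elements, two focal sets intersect only when
    # they are equal, so the unnormalized combined mass is the elementwise
    # product of the two assignments.
    joint = {label: m1[label] * m2[label] for label in m1}
    conflict = 1.0 - sum(joint.values())  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("Total conflict: Dempster's rule is undefined.")
    return {label: mass / (1.0 - conflict) for label, mass in joint.items()}


if __name__ == "__main__":
    # Hypothetical per-modality posteriors from two classifiers.
    video = dict(zip(EMOTIONS, [0.4, 0.1, 0.05, 0.1, 0.1, 0.1, 0.1, 0.05]))
    audio = dict(zip(EMOTIONS, [0.3, 0.05, 0.05, 0.2, 0.2, 0.1, 0.05, 0.05]))
    fused = combine_masses(video, audio)
    print(max(fused, key=fused.get))  # fused emotion decision -> "angry"
```

Under the singleton assumption, the rule reduces to an elementwise product of the two posteriors renormalized by one minus the conflict mass, and the fused decision is the label with the highest combined mass.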
