DeepEmo: Real-world facial expression analysis via deep learning

Recent research on automatic facial expression recognition has focused on optimizing performance on a few databases collected under controlled pose and lighting conditions, and has produced nearly perfect accuracy. This paper explores the characteristics of training datasets, feature representations, and machine learning algorithms required for a system that operates reliably under more realistic conditions. A new database, the Real-world Affective Face Database (RAF-DB), is presented, containing about 30,000 highly diverse facial images collected from social networks. Crowdsourcing results suggest that real-world expression recognition is a typical imbalanced multi-label classification problem, and that the balanced, single-label datasets currently used in the literature could steer research toward misleading algorithmic solutions. A deep learning architecture, DeepEmo, is proposed to address the real-world challenge of emotion recognition by learning high-level feature representations that are highly effective for discriminating realistic facial expressions. Extensive experiments show that the deep learning method is significantly superior to handcrafted features and that, under a near-frontal pose constraint, human-level recognition accuracy is achievable.
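The imbalanced multi-label framing can be made concrete with a minimal sketch. The label names, toy data, and weighting scheme below are illustrative assumptions, not RAF-DB's actual taxonomy or statistics; the sketch only shows how per-label inverse-frequency weights might be derived to rebalance a per-label loss when some expressions are far rarer than others and one face may carry several labels (e.g. a compound "happily surprised" expression).

```python
# Hypothetical illustration of per-label class weighting for an imbalanced
# multi-label expression dataset. Label names and counts are invented.

LABELS = ["happy", "sad", "surprise", "fear"]

# Each row is one face image; a face may carry several expression labels.
Y = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
]

def label_weights(y):
    """Inverse-frequency weight per label: rarer labels receive larger
    weights, which can scale each label's term in a binary loss."""
    n = len(y)
    n_labels = len(y[0])
    counts = [sum(row[j] for row in y) for j in range(n_labels)]
    return [n / (n_labels * c) for c in counts]

weights = label_weights(Y)
for name, w in zip(LABELS, weights):
    print(f"{name}: {w:.2f}")
```

Here the dominant "happy" label is down-weighted while the rare "sad" and "fear" labels are up-weighted, which is one common way such imbalance is handled; a balanced single-label dataset would hide exactly this difficulty.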
