Fusing dynamic deep learned features and handcrafted features for facial expression recognition

Abstract The automated recognition of facial expressions has been actively researched because of its wide-ranging applications. Recent advances in deep learning have improved the performance of facial expression recognition (FER) methods. In this paper, we propose a framework that combines discriminative features learned by convolutional neural networks with handcrafted shape- and appearance-based features to further improve the robustness and accuracy of FER. In addition, texture information is extracted from facial patches to enhance the discriminative power of the texture features. By encoding shape, appearance, and deep dynamic information, the proposed framework outperforms state-of-the-art FER methods on the CK+ dataset.
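The abstract does not say how the learned and handcrafted features are combined, so the following is only a minimal sketch of one common option, feature-level fusion by concatenation. It assumes that a deep feature vector and a shape vector (for example, flattened landmark coordinates) are already available, and it uses a basic 8-neighbour local binary pattern (LBP) histogram per facial patch as a stand-in for the patch-based texture descriptor. The function names, patch size, and vector dimensions are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Flatten a feature block and scale it to unit L2 norm."""
    v = np.asarray(v, dtype=np.float64).ravel()
    return v / (np.linalg.norm(v) + eps)


def lbp_histogram(patch):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes."""
    p = np.asarray(patch, dtype=np.float64)
    h, w = p.shape
    center = p[1:-1, 1:-1]
    # Neighbour offsets (row, col), visited clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = p[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / max(hist.sum(), 1.0)


def patch_texture_features(face, patch_size=16):
    """Concatenate LBP histograms over non-overlapping square patches."""
    face = np.asarray(face)
    rows, cols = face.shape
    hists = []
    for r in range(0, rows - patch_size + 1, patch_size):
        for c in range(0, cols - patch_size + 1, patch_size):
            hists.append(lbp_histogram(face[r:r + patch_size, c:c + patch_size]))
    return np.concatenate(hists)


def fuse_features(deep, shape, texture):
    """Feature-level fusion: normalise each block, then concatenate."""
    return np.concatenate([l2_normalize(deep),
                           l2_normalize(shape),
                           l2_normalize(texture)])


# Example with random placeholders for the face crop and the two vectors.
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(32, 32))   # hypothetical aligned face crop
deep = rng.normal(size=128)                  # hypothetical CNN feature vector
shape = rng.normal(size=136)                 # hypothetical 68 landmarks x 2
fused = fuse_features(deep, shape, patch_texture_features(face))
```

Normalising each block before concatenation keeps one feature type from dominating the fused vector purely because of its scale or dimensionality; the fused vector would then be passed to whatever classifier the pipeline uses.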
