Single-Image Facial Expression Recognition Using Deep 3D Re-Centralization

Facial expression recognition (FER) aims to encode expression information from faces. Previous studies often hold the assumption that human subjects should properly face the camera. Such a laboratory-controlled condition, however, is too rigid for in-wide applications. To tackle this issue, we propose a single image facial expression recognition method that is robust to face orientation and light conditions. We achieved this by proposing a novel face re-centralization method by reconstructing a 3D face model from a single image. We then propose a novel end-to-end deep neural network that utilizes both re-centralized 3D model and landmarks for FER task. A comprehensive evaluation on three real-world datasets illustrates that the proposed model outperforms the state-of-the-art techniques in both large-scale and small-scale datasets. The superiority of our model on effectiveness and robustness is also demonstrated in both laboratory conditions and wild images.

[1]  Richard Szeliski,et al.  Computer Vision - Algorithms and Applications , 2011, Texts in Computer Science.

[2]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Àgata Lapedriza,et al.  Emotion Recognition in Context , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Takeo Kanade,et al.  The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[5]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Shuicheng Yan,et al.  Peak-Piloted Deep Network for Facial Expression Recognition , 2016, ECCV.

[7]  Tong Zhang,et al.  Multi-cue fusion for emotion recognition in the wild , 2018, Neurocomputing.

[8]  Aseem Agarwala,et al.  A Compact Embedding for Facial Expression Similarity , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Xi Zhou,et al.  Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network , 2018, ECCV.

[10]  Daijin Kim,et al.  Facial Expression Recognition , 2009 .

[11]  Qingming Huang,et al.  Dependency Exploitation: A Unified CNN-RNN Approach for Visual Emotion Recognition , 2017, IJCAI.

[12]  Björn W. Schuller,et al.  Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling , 2010, INTERSPEECH.

[13]  Changsheng Xu,et al.  Joint Pose and Expression Modeling for Facial Expression Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[14]  Fernando De la Torre,et al.  Supervised Descent Method and Its Applications to Face Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Zechao Li,et al.  Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Josephine Sullivan,et al.  One millisecond face alignment with an ensemble of regression trees , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Cha Zhang,et al.  Image based Static Facial Expression Recognition with Multiple Deep Network Learning , 2015, ICMI.

[18]  Yong Du,et al.  Facial Expression Recognition Based on Deep Evolutional Spatial-Temporal Networks , 2017, IEEE Transactions on Image Processing.

[19]  J. Gower Generalized procrustes analysis , 1975 .

[20]  Matti Pietikäinen,et al.  Facial expression recognition from near-infrared videos , 2011, Image Vis. Comput..

[21]  Lijun Yin,et al.  Facial Expression Recognition by De-expression Residue Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Qiang Ji,et al.  Capturing Complex Spatio-temporal Relations among Facial Muscles for Facial Expression Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Shiguang Shan,et al.  Facial Expression Recognition with Inconsistently Annotated Datasets , 2018, ECCV.

[25]  Mohammad H. Mahoor,et al.  Going deeper in facial expression recognition using deep neural networks , 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[26]  Ronen Basri,et al.  Lambertian Reflectance and Linear Subspaces , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[27]  Junping Du,et al.  Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Yang Li,et al.  3D model-based continuous emotion recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Junmo Kim,et al.  Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[30]  Aleix M. Martínez,et al.  EmotioNet: An Accurate, Real-Time Algorithm for the Automatic Annotation of a Million Facial Expressions in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Rainer Lienhart,et al.  Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection , 2003, DAGM-Symposium.