Occlusion-Adaptive Deep Network for Robust Facial Expression Recognition

Recognizing the expressions of partially occluded faces is a challenging computer vision problem. Previous expression recognition methods either overlooked this issue or addressed it under unrealistic assumptions. Motivated by the fact that the human visual system is adept at ignoring occlusions and focusing on non-occluded facial areas, we propose a landmark-guided attention branch that finds and discards corrupted features from occluded regions so that they are not used for recognition. An attention map is first generated to indicate whether a specific facial part is occluded and to guide our model to attend to non-occluded regions. To further improve robustness, we propose a facial region branch that partitions the feature maps into non-overlapping facial blocks and tasks each block with predicting the expression independently. This yields more diverse and discriminative features, enabling the expression recognition system to recover even when the face is partially occluded. Owing to the synergy between the two branches, our occlusion-adaptive deep network significantly outperforms state-of-the-art methods on two challenging in-the-wild benchmark datasets and three real-world occluded expression datasets.
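
The two ideas described above, gating features with an attention map and letting non-overlapping blocks predict independently, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the attention map is derived from landmarks, how many blocks are used, or what the prediction heads look like, so the function names, the 2x2 block grid, the 7 expression classes, and the hand-made attention map are all assumptions made for illustration.

```python
import numpy as np


def apply_attention(features, attention):
    """Scale a (C, H, W) feature map by an (H, W) attention map in [0, 1].

    Locations with attention 0 (assumed occluded) contribute nothing downstream.
    """
    return features * attention[None, :, :]


def partition_blocks(features, rows, cols):
    """Split a (C, H, W) feature map into rows * cols non-overlapping blocks."""
    _, h, w = features.shape
    if h % rows or w % cols:
        raise ValueError("H and W must be divisible by rows and cols")
    bh, bw = h // rows, w // cols
    return [
        features[:, r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
        for r in range(rows)
        for c in range(cols)
    ]


def block_logits(blocks, heads):
    """Each block predicts on its own: global average pooling, then a linear head."""
    return [head @ block.mean(axis=(1, 2)) for block, head in zip(blocks, heads)]


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))   # hypothetical backbone output (C=8)

# Hypothetical attention map: pretend the lower half of the face is occluded.
attention = np.ones((4, 4))
attention[2:, :] = 0.0

gated = apply_attention(features, attention)
blocks = partition_blocks(gated, rows=2, cols=2)

# One hypothetical linear head per block, mapping 8 channels to 7 classes.
heads = [rng.standard_normal((7, 8)) for _ in blocks]
logits = block_logits(blocks, heads)
```

In this toy setup the two lower blocks are fully suppressed by the attention map, so their heads output all-zero logits, while the two upper blocks still produce independent predictions from the visible region.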
