Lossless Attention in Convolutional Networks for Facial Expression Recognition in the Wild

Unlike faces captured under constrained, frontal conditions, faces in the wild are subject to many unconstrained interference factors, such as complex illumination, changing perspective, and various occlusions. Facial expression recognition (FER) in the wild is therefore a challenging task, and existing methods do not perform well on it. For occluded faces (covering both occlusion by other objects and self-occlusion caused by changes in head pose), however, an attention mechanism can automatically focus on the non-occluded regions. In this paper, we propose a Lossless Attention Model (LLAM) for convolutional neural networks (CNNs) that extracts attention-aware features from faces. Our module avoids information decay when generating attention maps by using the information from the previous layer and by not reducing the dimensionality. We then adaptively refine the feature responses by fusing the attention map with the feature map. We participate in the seven basic expression classification sub-challenge of the FG-2020 Affective Behavior Analysis in-the-wild (ABAW) Challenge and validate our method on the Aff-Wild2 dataset released by the Challenge. On the validation set, our method achieves a total accuracy (Accuracy) of 0.49 and an unweighted mean F1 score (F1) of 0.38, giving a final result of 0.42 (0.67 × F1 + 0.33 × Accuracy).
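
The final result quoted above is a weighted sum of the two validation metrics. As a minimal sketch of that arithmetic, the snippet below recomputes it from the reported F1 and Accuracy values; the function name `challenge_score` and its parameter names are hypothetical, chosen here for illustration, and are not taken from the paper or the challenge toolkit.

```python
def challenge_score(f1: float, accuracy: float,
                    w_f1: float = 0.67, w_acc: float = 0.33) -> float:
    """Weighted combination of unweighted-mean F1 and total accuracy.

    The weights 0.67 and 0.33 are the ones stated in the abstract;
    the function itself is an illustrative helper, not challenge code.
    """
    return w_f1 * f1 + w_acc * accuracy


# Validation-set values reported in the abstract.
reported_f1 = 0.38
reported_accuracy = 0.49

score = challenge_score(reported_f1, reported_accuracy)
print(f"combined score: {score:.2f}")
```

Since 0.67 × 0.38 + 0.33 × 0.49 ≈ 0.416, the value rounds to the 0.42 reported in the abstract.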
