OAENet: Oriented attention ensemble for accurate facial expression recognition

Abstract: Facial Expression Recognition (FER) is a challenging yet important research topic owing to its academic and commercial potential. In this work, we propose an oriented attention pseudo-siamese network that takes advantage of both global and local facial information for highly accurate FER. Our network consists of two branches: a maintenance branch, which consists of several convolutional blocks and exploits high-level semantic features, and an attention branch, which has a UNet-like architecture and captures salient local information. Specifically, we first input the face image into the maintenance branch. For the attention branch, we calculate the correlation coefficient between a face and its sub-regions. Next, we construct a weighted mask by combining the facial landmarks with the correlation coefficients. The weighted mask is then sent to the attention branch. Finally, the two branches are fused to output the classification results. In this way, a direction-dependent attention mechanism is established to remedy the insufficient utilization of local information. With the help of this attention mechanism, our network not only captures the global picture but also concentrates on important local areas. Experiments are carried out on four leading facial expression datasets, where our method compares favorably with other state-of-the-art methods.
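The mask-construction step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not say how the correlation between a face and its sub-regions is computed, so everything concrete here is an assumption made for illustration: the Pearson coefficient is taken between a landmark-centred patch and a block-averaged thumbnail of the whole face, the absolute value is used as the weight, and the helper names (`pearson`, `downsample`, `weighted_mask`) and the `patch` parameter are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two equally sized arrays."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A constant patch has no variance; treat its correlation as zero.
    return 0.0 if denom == 0 else float((a * b).sum() / denom)


def downsample(face, size):
    """Block-average a grayscale face down to a size x size thumbnail."""
    h, w = face.shape
    fh, fw = h // size, w // size
    cropped = face[: fh * size, : fw * size]
    return cropped.reshape(size, fh, size, fw).mean(axis=(1, 3))


def weighted_mask(face, landmarks, patch=16):
    """Assumed scheme: weight each landmark-centred patch by |correlation|.

    face      : 2-D grayscale array, at least patch x patch pixels
    landmarks : iterable of (row, col) landmark positions
    """
    h, w = face.shape
    thumb = downsample(face, patch)
    mask = np.zeros((h, w), dtype=np.float64)
    half = patch // 2
    for row, col in landmarks:
        # Clamp the patch so it stays inside the image.
        top = int(np.clip(row - half, 0, h - patch))
        left = int(np.clip(col - half, 0, w - patch))
        region = face[top : top + patch, left : left + patch]
        weight = abs(pearson(thumb, region))
        window = mask[top : top + patch, left : left + patch]
        # Overlapping patches keep the larger weight.
        mask[top : top + patch, left : left + patch] = np.maximum(window, weight)
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_face = rng.random((64, 64))          # stand-in for a cropped face
    demo_landmarks = [(20, 20), (40, 44)]     # stand-in landmark positions
    demo_mask = weighted_mask(demo_face, demo_landmarks)
    print(demo_mask.shape)
```

In this sketch the mask is zero wherever no landmark patch falls, so a downstream attention branch would see only the weighted landmark neighbourhoods; how the paper actually feeds the mask to its UNet-like branch is not specified in the abstract.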
