Facial Expression Recognition Using Residual Masking Network

Automatic facial expression recognition (FER) has gained much attention due to its applications in human-computer interaction. Among the approaches to improving FER, this paper focuses on deep architectures with attention mechanisms. We propose a novel masking idea to boost the performance of CNNs on facial expression tasks. It uses a segmentation network to refine feature maps, enabling the network to focus on relevant information when making decisions. In experiments, we combine the ubiquitous Deep Residual Network with a U-Net-like architecture to produce a Residual Masking Network. The proposed method achieves state-of-the-art (SOTA) accuracy on the well-known FER2013 dataset and the private VEMO dataset.
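
The masking idea can be illustrated with a minimal sketch in plain Python on a single 2D feature map. It is not the paper's implementation: the `toy_mask` function is a hypothetical stand-in for the U-Net-like branch (fixed pooling and upsampling instead of learned convolutions), and the combination rule `features * (1 + mask)` is an assumption borrowed from residual-attention-style designs, since the abstract does not state how the mask is applied.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_mask(fmap):
    """Hypothetical stand-in for a U-Net-like masking branch on one channel.

    Assumes the map has even height and width. A real branch would use
    learned convolutions rather than fixed pooling and upsampling.
    """
    h, w = len(fmap), len(fmap[0])
    # Encoder step: 2x2 average pooling halves the resolution.
    pooled = [
        [(fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]
    # Decoder step: nearest-neighbour upsampling plus a skip connection,
    # squashed by a sigmoid so every mask value lies in (0, 1).
    return [
        [sigmoid(pooled[i // 2][j // 2] + fmap[i][j]) for j in range(w)]
        for i in range(h)
    ]


def apply_mask(fmap, mask):
    """Assumed combination rule: keep the input and add its masked copy."""
    return [
        [f * (1.0 + m) for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(fmap, mask)
    ]


# Made-up 4x4 feature map, for illustration only.
features = [
    [0.5, -1.0, 2.0, 0.0],
    [1.5, 0.2, -0.3, 0.8],
    [-2.0, 0.1, 0.4, 1.0],
    [0.0, 0.6, -0.7, 0.9],
]
mask = toy_mask(features)
refined = apply_mask(features, mask)
```

Under this assumed rule the mask can only amplify activations, by a factor between 1 and 2, so the refined map never discards what the residual layer produced; it reweights positions relative to one another.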
