E2-Capsule Neural Networks for Facial Expression Recognition Using AU-Aware Attention

Capsule neural networks are a recent and popular technique in deep learning. However, the traditional capsule neural network does not extract features sufficiently before the dynamic routing between capsules. In this paper, we propose a Double Enhanced Capsule Neural Network (E2-Capsnet) that uses AU-aware attention for facial expression recognition (FER). The E2-Capsnet takes advantage of dynamic routing between capsules and has two enhancement modules that benefit FER. The first enhancement module is a convolutional neural network with AU-aware attention, which helps the model focus on the active areas of the expression. The second enhancement module is a capsule neural network with multiple convolutional layers, which strengthens the feature representation. Finally, the squashing function is used to classify the facial expression. We demonstrate the effectiveness of E2-Capsnet on two public benchmark datasets, RAF-DB and EmotioNet. The experimental results show that E2-Capsnet outperforms state-of-the-art methods. Our implementation will be made publicly available online.
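The abstract does not spell out the squashing step, so the following is only a minimal sketch of how squashing-based classification is commonly done in capsule networks. It assumes the standard formulation from the original dynamic-routing capsule paper, v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||), and assumes the predicted class is the capsule with the longest squashed output vector. The function names `squash`, `capsule_length`, and `predict`, and the toy capsule values, are hypothetical and not taken from the E2-Capsnet implementation.

```python
import math


def squash(s, eps=1e-9):
    """Shrink a capsule vector so its length lies in [0, 1), keeping its direction.

    Assumed formulation: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    eps guards against division by zero for an all-zero vector.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def capsule_length(v):
    """Euclidean length of a capsule vector."""
    return math.sqrt(sum(x * x for x in v))


def predict(class_capsules):
    """Return (index of the longest squashed capsule, list of all squashed lengths).

    Each entry of class_capsules is the unsquashed output vector of one
    expression-class capsule (an assumption made for this sketch).
    """
    lengths = [capsule_length(squash(s)) for s in class_capsules]
    best = max(range(len(lengths)), key=lengths.__getitem__)
    return best, lengths


if __name__ == "__main__":
    # Toy example with three hypothetical expression capsules of dimension 2.
    capsules = [[0.1, 0.0], [3.0, 4.0], [1.0, 0.0]]
    label, lengths = predict(capsules)
    print(label, [round(length, 3) for length in lengths])
```

Because the squashed length increases monotonically with the input norm and stays below 1, it can be read as a confidence score for the presence of the corresponding expression class.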
