Behind the Scenes: Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition

In this paper, we present a sparsity-aware deep network for automatic 4D facial expression recognition (FER). Given 4D data, we first propose a novel augmentation method to address the shortage of training data for deep learning. It projects the input data into RGB and depth map images and then iteratively performs randomized channel concatenation. We also introduce an effective way to capture facial muscle movements from the given 3D landmarks by projecting them onto three orthogonal planes (TOP), yielding the TOP-landmarks over multi-views. Importantly, we then present a sparsity-aware deep network that computes sparse representations of convolutional features over multi-views. This not only improves recognition accuracy but also reduces the computational cost. The TOP-landmarks and sparse representations are then used to train a long short-term memory (LSTM) network. The predictions are refined when the learned features collaborate over multi-views. Extensive experiments on the BU-4DFE dataset show that our method outperforms state-of-the-art methods, reaching an accuracy of 99.69% for 4D FER.
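
The abstract does not spell out the augmentation or the sparsification step, so the following is only a minimal sketch under stated assumptions. It assumes that randomized channel concatenation means drawing three channels at random from the pool {R, G, B, depth}, and it uses plain top-k hard thresholding as a stand-in for the sparse representation of convolutional features. The function names `random_channel_concat` and `hard_threshold` and their parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def random_channel_concat(rgb, depth, n_aug=4, seed=0):
    """Assumed augmentation: build n_aug three-channel images by randomly
    picking 3 distinct channels from the pool {R, G, B, depth}."""
    rng = np.random.default_rng(seed)
    # Stack the depth map as a fourth channel: (H, W, 3) + (H, W) -> (H, W, 4)
    pool = np.concatenate([rgb, depth[..., None]], axis=-1)
    augmented = []
    for _ in range(n_aug):
        idx = rng.choice(pool.shape[-1], size=3, replace=False)
        augmented.append(pool[..., idx])
    return np.stack(augmented)  # (n_aug, H, W, 3)

def hard_threshold(features, k):
    """Assumed sparsification: keep the k largest-magnitude entries of a
    feature array and set all others to zero (requires 1 <= k <= size)."""
    flat = features.ravel()
    sparse = np.zeros_like(flat)
    keep = np.argpartition(np.abs(flat), -k)[-k:]
    sparse[keep] = flat[keep]
    return sparse.reshape(features.shape)

# Usage with synthetic data standing in for projected RGB and depth images.
rgb = np.random.rand(8, 8, 3)
depth = np.random.rand(8, 8)
views = random_channel_concat(rgb, depth, n_aug=5)
sparse_feat = hard_threshold(np.array([0.1, -3.0, 0.5, 2.0]), k=2)
```

In this sketch each augmented sample keeps the spatial layout of the projections and changes only which modalities occupy the three input channels, so a standard three-channel CNN could consume the result unchanged.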
