Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning

To address the problem of data inconsistencies among different facial expression recognition (FER) datasets, many cross-domain FER (CD-FER) methods have been devised in recent years. Although each claims superior performance, fair comparisons are lacking because the methods use inconsistent source/target datasets and feature extractors. In this work, we first analyze how these inconsistent choices affect performance, and then re-implement several well-performing CD-FER algorithms and recently published domain adaptation algorithms. We ensure that all of these algorithms adopt the same source datasets and feature extractors for fair CD-FER evaluation. We find that most current leading algorithms use adversarial learning to learn holistic domain-invariant features that mitigate domain shift. However, these algorithms ignore local features, which are more transferable across datasets and carry more detailed content for fine-grained adaptation. To address this issue, we integrate graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation by developing a novel adversarial graph representation adaptation (AGRA) framework. Specifically, it first builds two graphs that correlate holistic and local regions within each domain and across different domains, respectively. Then, it extracts holistic-local features from the input image and uses learnable per-class statistical distributions to initialize the corresponding graph nodes. Finally, two stacked graph convolution networks (GCNs) propagate holistic-local features within each domain to explore their interaction, and across different domains for holistic-local feature co-adaptation. We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
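The two-graph, two-stage propagation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the fully connected graph layout, the node count (one holistic node plus five local regions per domain), the feature size, and the function names `build_graphs`, `gcn_layer`, and `propagate` are all assumptions made for illustration. The adversarial loss and the learnable per-class statistics are omitted, and the layer uses the standard symmetric-normalized graph convolution rather than whatever variant the paper adopts.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(norm_adj, feats, weight):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(norm_adj @ feats @ weight, 0.0)

def build_graphs(n):
    """Hypothetical layout: nodes 0..n-1 are source regions, n..2n-1 are target regions.

    intra: edges only between distinct regions of the same domain.
    inter: edges only between regions of different domains.
    """
    full = np.ones((n, n)) - np.eye(n)
    zeros = np.zeros((n, n))
    ones = np.ones((n, n))
    intra = np.block([[full, zeros], [zeros, full]])
    inter = np.block([[zeros, ones], [ones, zeros]])
    return intra, inter

def propagate(source_feats, target_feats, w_intra, w_inter):
    """Stack two GCN layers: intra-domain propagation, then inter-domain propagation."""
    n = source_feats.shape[0]
    intra, inter = build_graphs(n)
    nodes = np.vstack([source_feats, target_feats])
    h = gcn_layer(normalize_adjacency(intra), nodes, w_intra)
    h = gcn_layer(normalize_adjacency(inter), h, w_inter)
    return h[:n], h[n:]

# Usage with random placeholders (assumed sizes: 6 regions per domain, 8-dim features).
rng = np.random.default_rng(0)
src = rng.normal(size=(6, 8))   # stand-in for features extracted from an input image
tgt = rng.normal(size=(6, 8))   # stand-in for nodes initialized from per-class statistics
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1
src_out, tgt_out = propagate(src, tgt, w1, w2)
print(src_out.shape, tgt_out.shape)
```

In this sketch the first layer mixes holistic and local nodes only within a domain, and the second mixes them only across domains, which mirrors the ordering stated above. In a full system the propagated features would then feed both the expression classifier and the domain discriminator.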
