Multi-label Learning with Missing Values using Combined Facial Action Unit Datasets

Facial action units provide an objective, standardized description of facial micro-movements that can be used to characterize emotions in human faces. Annotating data with action units is an expensive and time-consuming task, which leads to data scarcity. By combining multiple datasets from different studies, the amount of training data for a machine learning algorithm can be increased in order to build robust models for automated, multi-label action unit detection. However, each study annotates a different set of action units, leaving a tremendous number of missing labels in a combined database. In this work, we examine this challenge and present our approach to creating a combined database and an algorithm capable of learning in the presence of missing labels without inferring their values. Our approach shows competitive performance compared to recent action unit detection competitions.
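
In practice, learning without inferring missing labels typically comes down to masking them out of the loss. The sketch below shows one way this could look in PyTorch as a masked binary cross-entropy; the encoding of missing labels as -1, the function name masked_bce_loss, and the assumption that the model emits raw logits are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def masked_bce_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy over observed action unit labels only.

    logits: (batch, num_aus) raw model outputs.
    labels: (batch, num_aus) with 0/1 for annotated action units and -1
            where the source dataset did not annotate that action unit
            (an assumed encoding, not taken from the paper).
    """
    mask = (labels != -1).float()        # 1 where a label exists, 0 where missing
    # Replace the -1 placeholders with a dummy target of 0; these entries
    # are zeroed out by the mask below and never influence the gradient.
    targets = labels.clamp(min=0).float()
    per_element = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # Average only over the observed entries, so unannotated action units
    # contribute neither to the loss value nor to the gradient.
    return (per_element * mask).sum() / mask.sum().clamp(min=1.0)

# Toy check: 4 samples, 12 action units, labels drawn from {-1, 0, 1}.
logits = torch.randn(4, 12, requires_grad=True)
labels = torch.randint(-1, 2, (4, 12)).float()
masked_bce_loss(logits, labels).backward()
```

With this kind of masking, each training image contributes gradients only for the action units its source dataset actually annotated, which is what allows datasets with disjoint label sets to be pooled without imputing values for the unannotated units.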
