Unique Class Group Based Multi-Label Balancing Optimizer for Action Unit Detection

Balancing methods designed for single-label data cannot simply be applied to multi-label problems: oversampling a rare label also replicates the frequent labels that co-occur in the same samples. We therefore propose to reformulate multi-label balancing as an optimization problem. We apply the resulting balancing algorithm to training datasets for detecting isolated facial movements, so-called Action Units. Combinations of several Action Units can describe emotions or physical states such as pain. Since datasets in this area are limited and mostly imbalanced, we show how optimized balancing followed by augmentation improves Action Unit detection. At the 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), we ranked third in the Affective Behavior Analysis in-the-wild (ABAW) challenge for the Action Unit detection task.
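The abstract only states that balancing is cast as an optimization problem; the concrete formulation is not given here. As an illustrative sketch only (not the paper's method), one plausible reading is to choose non-negative repetition counts per training sample so that the resulting per-label (Action Unit) counts approach a common target, solved here with non-negative least squares. All names in the snippet (balance_multilabel, the label matrix Y, target_count) are hypothetical.

```python
# Minimal sketch, assuming a binary label matrix Y of shape (n_samples, n_labels):
# pick per-sample repetition counts x >= 0 so that Y^T x is close to a uniform
# per-label target, i.e. the resampled dataset is (approximately) label-balanced.
import numpy as np
from scipy.optimize import nnls

def balance_multilabel(Y, target_count=None):
    """Return integer repetition counts per sample (hypothetical helper)."""
    Y = np.asarray(Y, dtype=float)
    n_samples, n_labels = Y.shape
    if target_count is None:
        # Aim for the count of the most frequent label (pure oversampling).
        target_count = int(Y.sum(axis=0).max())
    target = np.full(n_labels, float(target_count))
    # Non-negative least squares: minimize ||Y^T x - target|| with x >= 0.
    x, _ = nnls(Y.T, target)
    # Keep every original sample at least once and round to whole copies.
    return np.maximum(np.rint(x), 1).astype(int)

if __name__ == "__main__":
    # Toy example: AU1 is rare, AU2 is frequent, and they partly co-occur.
    Y = np.array([[1, 1],
                  [0, 1],
                  [0, 1],
                  [1, 0]])
    reps = balance_multilabel(Y)
    resampled = np.repeat(np.arange(len(Y)), reps)
    print("repetition counts:", reps)
    print("label counts after resampling:", Y[resampled].sum(axis=0))
```

Rounding the continuous solution to integer copies keeps the resampling step trivial; a stricter variant of the same idea would solve an integer program over the repetition counts instead.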
