AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild

Automated affective computing in the wild setting is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (aka the categorical model). There are very limited annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal). To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet). AffectNet contains more than 1,000,000 facial images from the Internet by querying three major search engines using 1,250 emotion related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild enabling research in automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.

[1]  Jacob Cohen A Coefficient of Agreement for Nominal Scales , 1960 .

[2]  Klaus Krippendorff,et al.  Estimating the Reliability, Systematic Error and Random Error of Interval Data , 1970 .

[3]  P. Ekman,et al.  Constants across cultures in the face and emotion. , 1971, Journal of personality and social psychology.

[4]  P. Ekman,et al.  Facial action coding system: a technique for the measurement of facial movement , 1978 .

[5]  J. Fleiss,et al.  Intraclass correlations: uses in assessing rater reliability. , 1979, Psychological bulletin.

[6]  J. Russell A circumplex model of affect. , 1980 .

[7]  P. Ekman,et al.  EMFACS-7: Emotional Facial Action Coding System , 1983 .

[8]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[9]  M. Bradley,et al.  Measuring emotion: the Self-Assessment Manikin and the Semantic Differential. , 1994, Journal of behavior therapy and experimental psychiatry.

[10]  Alexander J. Smola,et al.  Support Vector Regression Machines , 1996, NIPS.

[11]  Michael J. Lyons,et al.  Coding facial expressions with Gabor wavelets , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[12]  Takeo Kanade,et al.  Recognizing Action Units for Facial Expression Analysis , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[13]  Sergio Bermejo,et al.  Oriented principal component analysis for large margin classifiers , 2001, Neural Networks.

[14]  Chengjun Liu,et al.  Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition , 2002, IEEE Trans. Image Process..

[15]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[16]  Jeffrey F. Cohn,et al.  The Timing of Facial Motion in posed and Spontaneous Smiles , 2003, Int. J. Wavelets Multiresolution Inf. Process..

[17]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[18]  Maja Pantic,et al.  Web-based database for facial expression analysis , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[19]  Tieniu Tan,et al.  Affective Computing: A Review , 2005, ACII.

[20]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[21]  Maja Pantic,et al.  Spontaneous vs. posed facial behavior: automatic analysis of brow actions , 2006, ICMI '06.

[22]  Stan Szpakowicz,et al.  Beyond Accuracy, F-Score and ROC: A Family of Discriminant Measures for Performance Evaluation , 2006, Australian Conference on Artificial Intelligence.

[23]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[24]  Kostas Karpouzis,et al.  User and context adaptive neural networks for emotion recognition , 2008, Neurocomputing.

[25]  Takeo Kanade,et al.  Multi-PIE , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[26]  Shaogang Gong,et al.  Facial expression recognition based on Local Binary Patterns: A comprehensive study , 2009, Image Vis. Comput..

[27]  Hatice Gunes,et al.  Audio-Visual Classification and Fusion of Spontaneous Affective Data in Likelihood Space , 2010, 2010 20th International Conference on Pattern Recognition.

[28]  Takeo Kanade,et al.  The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[29]  Björn W. Schuller,et al.  AVEC 2011-The First International Audio/Visual Emotion Challenge , 2011, ACII.

[30]  Tamás D. Gedeon,et al.  Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[31]  Hatice Gunes,et al.  Continuous Prediction of Spontaneous Affect from Multiple Cues and Modalities in Valence-Arousal Space , 2011, IEEE Transactions on Affective Computing.

[32]  Aleix M. Martínez,et al.  Kernel Optimization in Discriminant Analysis , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Jeffrey F. Cohn,et al.  Painful data: The UNBC-McMaster shoulder pain expression archive database , 2011, Face and Gesture 2011.

[34]  Thierry Pun,et al.  DEAP: A Database for Emotion Analysis ;Using Physiological Signals , 2012, IEEE Transactions on Affective Computing.

[35]  Maja Pantic,et al.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING , 2022 .

[36]  Mohammad H. Mahoor,et al.  Bidirectional Warping of Active Appearance Model , 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[37]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[38]  Margaret McRorie,et al.  The Belfast Induced Natural Emotion Database , 2012, IEEE Transactions on Affective Computing.

[39]  Björn W. Schuller,et al.  AVEC 2012: the continuous audio/visual emotion challenge , 2012, ICMI '12.

[40]  Fernando De la Torre,et al.  Facing Imbalanced Data--Recommendations for the Use of Performance Metrics , 2013, 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.

[41]  Yichuan Tang,et al.  Deep Learning using Linear Support Vector Machines , 2013, 1306.0239.

[42]  Kristy Elizabeth Boyer,et al.  Automatically Recognizing Facial Expression: Predicting Engagement and Frustration , 2013, EDM.

[43]  Fabien Ringeval,et al.  Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions , 2013, 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[44]  Mike Thelwall,et al.  Seeing Stars of Valence and Arousal in Blog Posts , 2013, IEEE Transactions on Affective Computing.

[45]  Daniel McDuff,et al.  Affectiva-MIT Facial Expression Dataset (AM-FED): Naturalistic and Spontaneous Facial Expressions Collected "In-the-Wild" , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[46]  Björn W. Schuller,et al.  AVEC 2013: the continuous audio/visual emotion and depression recognition challenge , 2013, AVEC@ACM Multimedia.

[47]  Abhinav Dhall,et al.  Emotion recognition in the wild challenge 2013 , 2013, ICMI '13.

[48]  Yoshua Bengio,et al.  Challenges in representation learning: A report on three machine learning contests , 2013, Neural Networks.

[49]  Mohammad H. Mahoor,et al.  DISFA: A Spontaneous Facial Action Intensity Database , 2013, IEEE Transactions on Affective Computing.

[50]  Jian Sun,et al.  Face Alignment at 3000 FPS via Regressing Local Binary Features , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[51]  Mohammad Reza Mohammadi,et al.  PCA-based dictionary building for accurate facial expression recognition via sparse representation , 2014, J. Vis. Commun. Image Represent..

[52]  Christian Szegedy,et al.  DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[53]  Yong Tao,et al.  Compound facial expressions of emotion , 2014, Proceedings of the National Academy of Sciences.

[54]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[55]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[56]  Björn W. Schuller,et al.  AVEC 2014: 3D Dimensional Affect and Depression Recognition Challenge , 2014, AVEC '14.

[57]  M. Mahoor,et al.  Facial expression recognition using lp-norm MKL multiclass-SVM , 2015 .

[58]  Mohammad H. Mahoor,et al.  Facial expression recognition using lp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${l}_{p}$$\end{document}-norm MKL , 2015, Machine Vision and Applications.

[59]  Fabien Ringeval,et al.  AVEC 2015: The 5th International Audio/Visual Emotion Challenge and Workshop , 2015, ACM Multimedia.

[60]  Dongmei Jiang,et al.  Multimodal Affective Dimension Prediction Using Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks , 2015, AVEC@ACM Multimedia.

[61]  Mohammad H. Mahoor,et al.  Facial Expression Recognition from World Wild Web , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[62]  Mohammad H. Mahoor,et al.  Going deeper in facial expression recognition using deep neural networks , 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[63]  Yuanliu Liu,et al.  Video-based emotion recognition using CNN-RNN and C3D hybrid networks , 2016, ICMI.

[64]  Stefanos Zafeiriou,et al.  300 Faces In-The-Wild Challenge: database and results , 2016, Image Vis. Comput..

[65]  Aleix M. Martínez,et al.  EmotioNet: An Accurate, Real-Time Algorithm for the Automatic Annotation of a Million Facial Expressions in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Fabien Ringeval,et al.  AVEC 2016: Depression, Mood, and Emotion Recognition Workshop and Challenge , 2016, AVEC@ACM Multimedia.

[67]  Guoying Zhao,et al.  Facial Affect “In-the-Wild”: A Survey and a New Database , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[68]  Seetha Hari,et al.  Learning From Imbalanced Data , 2019, Advances in Computer and Electrical Engineering.

[69]  P. Ekman,et al.  Facial action coding system , 2019 .