Cross-dataset learning and person-specific normalisation for automatic Action Unit detection

Automatic detection of Facial Action Units (AUs) is crucial for facial analysis systems. Because of large individual differences in facial appearance, the performance of AU classifiers depends heavily on the training data and on the ability to estimate a person's neutral facial expression. In this paper, we present a real-time Facial Action Unit intensity estimation and occurrence detection system based on appearance features (Histograms of Oriented Gradients) and geometry features (shape parameters and landmark locations). Our experiments show the benefits of using additional labelled data from different datasets, demonstrating the generalisability of our approach; this holds both when training for a specific dataset and when a generic model is needed. We also demonstrate the benefits of a simple and efficient median-based feature normalisation technique that accounts for person-specific neutral expressions. Finally, we show that our results outperform the FERA 2015 baselines in all three challenge tasks: AU occurrence detection, fully automatic AU intensity estimation, and pre-segmented AU intensity estimation.
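The median-based normalisation lends itself to a very short implementation. Below is a minimal Python sketch of the idea as stated in the abstract: the per-dimension median of a subject's feature vectors over time serves as an estimate of that subject's neutral expression (neutral frames dominate most recordings), and subtracting it removes person-specific appearance bias. The function name, array shapes, and usage pattern are illustrative assumptions, not the authors' released code.

    import numpy as np

    def person_specific_normalise(features):
        # features: (n_frames, n_dims) array of per-frame descriptors
        # (e.g. HOG appearance features concatenated with geometry
        # features) for a single subject.
        # The per-dimension median over a subject's frames approximates
        # that subject's neutral face, since neutral or near-neutral
        # frames are typically the most common in a recording.
        neutral_estimate = np.median(features, axis=0)
        # Subtracting the estimated neutral expression yields features
        # that describe deviation from neutral, which transfers better
        # across subjects and datasets.
        return features - neutral_estimate

    # Illustrative usage: normalise each subject's frames independently,
    # then pool the normalised frames for cross-dataset training.
    # normalised = {subj: person_specific_normalise(f)
    #               for subj, f in per_subject_features.items()}

One design point worth noting: the median is preferred over the mean here because it is robust to the expressive frames in the sequence, so the neutral estimate is not pulled towards strong expressions.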
