A Framework for Joint Estimation and Guided Annotation of Facial Action Unit Intensity

Manual annotation of facial action units (AUs) is highly tedious and time-consuming. Various methods for automatic coding of AUs have been proposed; however, their performance is still far below that attained by expert human coders. Several attempts have been made to leverage these methods to reduce the burden of manually coding AU activations (presence/absence). Nevertheless, this has not been exploited in the context of AU intensity coding, which is a far more difficult task. To this end, we propose an expert-driven probabilistic approach for joint modeling and estimation of AU intensities. Specifically, we introduce a Conditional Random Field model for joint estimation of AU intensity that updates its predictions in an iterative fashion by relying on the expert knowledge of human coders. We show in experiments on two publicly available datasets of AU intensity (DISFA and FERA2015) that the AU coding process can be significantly facilitated by the proposed approach, allowing human coders to make decisions about target AU intensity faster.
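The core idea — that a joint model over AU intensities lets a single expert correction propagate to the remaining, unannotated AUs — can be illustrated with a deliberately simple stand-in. The sketch below uses a multivariate Gaussian over three AU intensities instead of the paper's Conditional Random Field, and all numbers are invented for the example; it only shows the mechanics of conditioning on an expert-fixed value to revise the other estimates.

```python
import numpy as np

# Toy joint model over three AU intensities (stand-in for the paper's CRF).
# mu holds the model's initial estimates; Sigma encodes assumed AU co-occurrence.
mu = np.array([2.0, 1.0, 0.5])
Sigma = np.array([[1.0, 0.6, 0.2],
                  [0.6, 1.0, 0.4],
                  [0.2, 0.4, 1.0]])

def condition(mu, Sigma, idx, value):
    """Condition the joint Gaussian on AU `idx` being fixed to `value` by the expert,
    returning the revised mean and covariance of the remaining AUs."""
    rest = [i for i in range(len(mu)) if i != idx]
    S_rr = Sigma[np.ix_(rest, rest)]       # covariance among remaining AUs
    S_ri = Sigma[np.ix_(rest, [idx])]      # cross-covariance with the fixed AU
    S_ii = Sigma[idx, idx]                 # variance of the fixed AU
    mu_new = mu[rest] + (S_ri / S_ii).ravel() * (value - mu[idx])
    Sigma_new = S_rr - S_ri @ S_ri.T / S_ii
    return mu_new, Sigma_new

# Expert corrects the first AU from 2.0 to 4.0; the model revises the other two.
mu_rest, Sigma_rest = condition(mu, Sigma, 0, 4.0)
print(mu_rest)  # -> [2.2 0.9]: both estimates shift up via their correlation with AU 0
```

In the actual framework the update is an iterative CRF inference step rather than a closed-form Gaussian conditioning, but the annotation loop is analogous: the coder fixes the AU the model is least certain about, and the joint model re-estimates the rest.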
