Copula Ordinal Regression for Joint Estimation of Facial Action Unit Intensity

Joint modeling of the intensity of facial action units (AUs) from face images is challenging due to the large number of AUs (30+) and their intensity levels (6). This is partly due to the lack of suitable models that can efficiently handle such a large number of outputs/classes simultaneously, and partly due to the scarcity of labeled target data. For this reason, the majority of methods proposed so far resort to independent classifiers for AU intensity. This is suboptimal for at least two reasons: the facial appearance of some AUs changes depending on the intensity of other AUs, and some AUs co-occur more often than others. Encoding these dependencies is expected to improve the estimation of target AU intensities, especially in the presence of noisy image features, head-pose variations and imbalanced training data. To this end, we introduce a novel modeling framework, Copula Ordinal Regression (COR), that leverages the power of copula functions and conditional random fields (CRFs) to disentangle the probabilistic modeling of AU dependencies from the marginal modeling of AU intensity. Consequently, the COR model achieves joint learning and inference of the intensities of multiple AUs while remaining computationally tractable. We show on two challenging datasets of naturalistic facial expressions that the proposed approach consistently outperforms (i) independent modeling of AU intensities, and (ii) the state-of-the-art approach for the target task.

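To make the separation of marginal and dependence modeling concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of the basic building block: two cumulative-logit (ordinal) marginals for an AU pair coupled by a bivariate Frank copula into a joint distribution over intensity levels. All variable names, thresholds, and parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.special import expit  # logistic sigmoid

def ordinal_cdf(x, w, thresholds):
    """Marginal CDF P(y <= k | x) for each intensity threshold (cumulative logit)."""
    score = w @ x
    return expit(np.asarray(thresholds) - score)  # one value per threshold

def frank_copula(u, v, theta):
    """Bivariate Frank copula C_theta(u, v); theta controls AU dependence strength."""
    if abs(theta) < 1e-8:                      # theta -> 0 recovers independence: C = u*v
        return u * v
    num = (np.exp(-theta * u) - 1) * (np.exp(-theta * v) - 1)
    return -np.log(1 + num / (np.exp(-theta) - 1)) / theta

def joint_pmf(x, w1, t1, w2, t2, theta, levels=6):
    """Joint P(y1 = k, y2 = l | x) via rectangle differences of the copula CDF."""
    # Append 1.0 so the top intensity level has marginal CDF 1.
    F1 = np.append(ordinal_cdf(x, w1, t1), 1.0)
    F2 = np.append(ordinal_cdf(x, w2, t2), 1.0)
    P = np.zeros((levels, levels))
    for k in range(levels):
        for l in range(levels):
            c_hi = frank_copula(F1[k], F2[l], theta)
            c_k = frank_copula(F1[k - 1], F2[l], theta) if k > 0 else 0.0
            c_l = frank_copula(F1[k], F2[l - 1], theta) if l > 0 else 0.0
            c_kl = frank_copula(F1[k - 1], F2[l - 1], theta) if k > 0 and l > 0 else 0.0
            P[k, l] = c_hi - c_k - c_l + c_kl
    return P

# Toy usage: two AUs, 6 intensity levels (0-5), 10-D image features (all made up).
rng = np.random.default_rng(0)
x = rng.normal(size=10)
w1, w2 = rng.normal(size=10), rng.normal(size=10)
t1 = t2 = np.linspace(-2, 2, 5)               # 5 thresholds define 6 ordinal levels
P = joint_pmf(x, w1, t1, w2, t2, theta=3.0)
print(P.sum())                                 # ~1.0: a valid joint distribution
```

In the full COR model, such pairwise joints would be combined over a CRF-style graph of AUs and fitted with a composite likelihood so that learning and inference remain tractable; the sketch above only illustrates how a copula couples ordinal marginals for a single AU pair.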