A joint dictionary learning and regression model for intensity estimation of facial AUs

Abstract: Automated intensity estimation of spontaneous Facial Action Units (AUs), as defined by the Facial Action Coding System (FACS), is a relatively new and challenging problem. This paper presents a joint supervised dictionary learning (SDL) and regression model for solving it. The model is cast as an optimization problem whose objective consists of two terms. The first term represents the facial images in a sparse domain using dictionary learning, whereas the second term estimates AU intensities using a linear regression model in that sparse domain. The regression model accounts for disagreement between raters by including a constant biasing factor in the measured AU intensity values. Furthermore, since the intensity of a facial AU is non-negative (intensity values range from 0 to 5), we impose a non-negativity constraint on the estimated intensities by restricting the search space for the dictionary learning and the regression function. Our experimental results on the DISFA and FERA2015 databases show that this approach is promising for automated measurement of spontaneous facial AUs.
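As a rough illustration of the two-term structure described in the abstract, the sketch below evaluates a generic joint objective: a reconstruction term for the dictionary and a regression term with a constant bias. This is not the paper's exact formulation. The names `joint_objective`, `predict_intensity`, `lam`, and the symbols `X`, `D`, `A`, `w`, `b`, `y` are all assumptions introduced here. The sparsity constraint on the codes is not enforced, and non-negativity is illustrated only by assuming non-negative codes, weights, and bias, which is one possible way of restricting the search space.

```python
import numpy as np


def joint_objective(X, D, A, w, b, y, lam=1.0):
    """Value of a hypothetical joint objective (an assumption, not the paper's):

        ||X - D A||_F^2  +  lam * ||y - (w^T A + b)||_2^2

    X: (d, n) image features, D: (d, k) dictionary, A: (k, n) codes,
    w: (k,) regression weights, b: scalar bias, y: (n,) AU intensities.
    """
    # First term: how well the dictionary and codes reconstruct the data.
    reconstruction = np.linalg.norm(X - D @ A, "fro") ** 2
    # Second term: linear regression on the codes, with a constant bias.
    regression = np.sum((y - predict_intensity(A, w, b)) ** 2)
    return reconstruction + lam * regression


def predict_intensity(A, w, b):
    """Linear prediction in the code domain: w^T A + b."""
    return w @ A + b


# Toy data chosen so that both terms can be checked by hand.
D = np.eye(2)
A = np.array([[1.0, 0.0], [0.0, 2.0]])  # non-negative codes (assumed)
X = D @ A                               # perfectly reconstructed data
w = np.array([1.0, 0.5])                # non-negative weights (assumed)
b = 0.5                                 # non-negative bias (assumed)
y = predict_intensity(A, w, b)          # labels matching the prediction

value_at_perfect_fit = joint_objective(X, D, A, w, b, y)
value_with_label_error = joint_objective(X, D, A, w, b, y + np.array([0.0, 1.0]), lam=2.0)
```

With non-negative `A`, `w`, and `b`, every predicted intensity is non-negative by construction, which is the property the constraint is meant to guarantee. An actual solver would alternate between updating the codes, the dictionary, and the regression parameters under these constraints.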
