Learning to Detect Genuine versus Posed Pain from Facial Expressions using Residual Generative Adversarial Networks

We present a novel approach based on a Residual Generative Adversarial Network (R-GAN) that discriminates genuine pain expressions from posed ones by magnifying subtle changes in the face. In addition to the adversarial task, the discriminator network in R-GAN estimates the intensity level of the pain. We also propose a novel Weighted Spatiotemporal Pooling (WSP) that captures and encodes the appearance and dynamics of a given video sequence into an image map. Any video can thus be transformed into an image map that embeds subtle variations in facial appearance and dynamics, which allows any model pre-trained on still images to be used for video analysis. Our extensive experiments show that the proposed framework achieves promising results compared to state-of-the-art approaches on three benchmark databases: UNBC-McMaster Shoulder Pain, BioVid Heat Pain, and STOIC.
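To make the idea of pooling a video into a single image map concrete, the following is a minimal sketch in Python with NumPy. The abstract does not state how WSP computes its weights, so the weighting used here (each frame weighted by its mean absolute change from the previous frame) is an assumption chosen only for illustration, and the function name `weighted_spatiotemporal_pool` is hypothetical. It should not be read as the paper's formulation.

```python
import numpy as np


def weighted_spatiotemporal_pool(frames, eps=1e-8):
    """Collapse a video of shape (T, H, W) or (T, H, W, C) into one image map.

    ASSUMPTION: the weight of each frame is its mean absolute difference
    from the previous frame, normalised to sum to 1. The paper's actual
    WSP weighting is not given in the abstract.
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.ndim < 3 or frames.shape[0] < 2:
        raise ValueError("expected at least 2 frames of shape (H, W[, C])")

    # Frame-to-frame change, one value per transition: shape (T-1,)
    diffs = np.abs(np.diff(frames, axis=0))
    motion = diffs.reshape(diffs.shape[0], -1).mean(axis=1)

    # The first frame has no predecessor, so it reuses the first transition.
    motion = np.concatenate([motion[:1], motion])

    # eps keeps the weights well defined for a completely static video.
    weights = (motion + eps) / (motion + eps).sum()

    # Weighted sum over the time axis yields an image of shape (H, W[, C]).
    image_map = np.tensordot(weights, frames, axes=(0, 0))
    return image_map, weights


if __name__ == "__main__":
    # Toy video: three identical frames followed by one changed frame.
    video = np.zeros((4, 2, 2))
    video[3] = 1.0
    image_map, weights = weighted_spatiotemporal_pool(video)
    print(image_map.shape, weights)
```

Because the output has the same spatial shape as a single frame, it can be passed to a network that expects still images, which is the property the abstract relies on. Under this assumed weighting, frames with more motion contribute more to the map, and a static video reduces to a plain temporal average.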
