Geometry-Contrastive GAN for Facial Expression Transfer

In this paper, we propose a Geometry-Contrastive Generative Adversarial Network (GC-GAN) for transferring continuous emotions across different subjects. Given an input face with a certain emotion and a target facial expression from another subject, GC-GAN generates an identity-preserving face with the target expression. Geometry information is introduced into cGANs as continuous conditions to guide the generation of facial expressions. To handle the misalignment across different subjects or emotions, contrastive learning is used to transform the geometry manifold into an embedded semantic manifold of facial expressions. The embedded geometry is then injected into the latent space of the GAN, where it effectively controls emotion generation. Experimental results demonstrate that the proposed method can be applied to facial expression transfer even when there are large differences in facial shape and expression between subjects.
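The abstract pairs a contrastive geometry embedding with a conditional generator. The PyTorch sketch below illustrates these two ingredients only in outline; it is not the authors' implementation, and the network sizes, the 68-point landmark format, the contrastive margin, and all layer choices are illustrative assumptions.

# Minimal sketch (assumptions noted above): (1) a contrastive loss that embeds facial
# landmark geometry so that faces with similar expressions lie close on the embedded
# manifold, and (2) a toy conditional generator whose latent input is the identity code
# concatenated with that geometry embedding.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometryEncoder(nn.Module):
    """Embeds 68 (x, y) facial landmarks into a low-dimensional expression code."""
    def __init__(self, n_landmarks=68, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_landmarks * 2, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, landmarks):          # landmarks: (B, 68, 2)
        return self.net(landmarks.flatten(1))

def contrastive_loss(z1, z2, same_expression, margin=1.0):
    """Pull same-expression pairs together, push different-expression pairs apart."""
    d = F.pairwise_distance(z1, z2)
    pos = same_expression * d.pow(2)
    neg = (1 - same_expression) * F.relu(margin - d).pow(2)
    return (pos + neg).mean()

class ConditionalGenerator(nn.Module):
    """Toy generator: identity latent code concatenated with the geometry embedding."""
    def __init__(self, id_dim=128, geo_dim=32):
        super().__init__()
        self.fc = nn.Linear(id_dim + geo_dim, 128 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, identity_code, geometry_code):
        z = torch.cat([identity_code, geometry_code], dim=1)
        x = self.fc(z).view(-1, 128, 8, 8)
        return self.deconv(x)              # (B, 3, 64, 64) image

# Usage sketch: train the encoder with the contrastive loss, then let the embedded
# geometry of the target expression drive the generator.
enc, gen = GeometryEncoder(), ConditionalGenerator()
lm_a, lm_b = torch.randn(4, 68, 2), torch.randn(4, 68, 2)
same = torch.randint(0, 2, (4,)).float()
loss_geo = contrastive_loss(enc(lm_a), enc(lm_b), same)
fake = gen(torch.randn(4, 128), enc(lm_b))

The design point this sketch tries to capture is that the contrastive objective, rather than raw landmark coordinates, is what aligns geometry from different subjects onto a shared expression manifold, so the embedded code remains a meaningful condition when concatenated with another subject's identity representation.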
