A recurrent cycle consistency loss for progressive face-to-face synthesis

This paper addresses a major flaw of the cycle consistency loss when used to preserve the input appearance in the face-to-face synthesis domain. In particular, we show that the images generated by a network trained using this loss conceal a noise that hinders their use for further tasks. To overcome this limitation, we propose a “recurrent cycle consistency loss” which for different sequences of target attributes minimises the distance between the output images, independent of any intermediate step. We empirically validate not only that our loss enables the re-use of generated images, but that it also improves their quality. In addition, we propose the very first network that covers the task of unconstrained landmark-guided face-to-face synthesis. Contrary to previous works, our proposed approach enables the transfer of a particular set of input features to a large span of poses and expressions, whereby the target landmarks become the ground-truth points. We then evaluate the consistency of our proposed approach to synthesise faces at the target landmarks. To the best of our knowledge, we are the first to propose a loss to overcome the limitation of the cycle consistency loss, and the first to propose an “in-the-wild” landmark guided synthesis approach. Code and models for this paper can be found in https://github.com/ESanchezLozano/GANnotation.

[1]  Eric Dubois,et al.  Image up-sampling using total-variation regularization with a new observation model , 2005, IEEE Transactions on Image Processing.

[2]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[3]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[4]  Nicu Sebe,et al.  Every Smile is Unique: Landmark-Guided Diverse Smile Generation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Takeo Kanade,et al.  Multi-PIE , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[6]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[7]  Shaun J. Canavan,et al.  BP4D-Spontaneous: a high-resolution spontaneous 3D dynamic facial expression database , 2014, Image Vis. Comput..

[8]  Stefanos Zafeiriou,et al.  The First Facial Landmark Tracking in-the-Wild Challenge: Benchmark and Results , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[9]  Vishal M. Patel,et al.  GP-GAN: Gender Preserving GAN for Synthesizing Faces from Landmarks , 2017, 2018 24th International Conference on Pattern Recognition (ICPR).

[10]  Fernando De la Torre,et al.  A Functional Regression Approach to Facial Landmark Tracking , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Thomas S. Huang,et al.  Interactive Facial Feature Localization , 2012, ECCV.

[13]  Francesc Moreno-Noguer,et al.  GANimation: Anatomically-aware Facial Animation from a Single Image , 2018, ECCV.

[14]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[15]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16]  Zhenan Sun,et al.  Pose-Guided Photorealistic Face Rotation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17]  Abhinav Gupta,et al.  Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Hui Chen,et al.  Geometry-Contrastive Generative Adversarial Network for Facial Expression Synthesis , 2018, ArXiv.

[19]  Léon Bottou,et al.  Wasserstein GAN , 2017, ArXiv.

[20]  David J. Kriegman,et al.  Localizing Parts of Faces Using a Consensus of Exemplars , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Mario Lucic,et al.  Are GANs Created Equal? A Large-Scale Study , 2017, NeurIPS.

[23]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[24]  Stefanos Zafeiriou,et al.  A Semi-automatic Methodology for Facial Landmark Annotation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[25]  Ran He,et al.  Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[26]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[27]  Michel F. Valstar,et al.  Cascaded Continuous Regression for Real-time Incremental Face Tracking , 2016, ECCV.

[28]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[30]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Jae Hyun Lim,et al.  Geometric GAN , 2017, ArXiv.

[32]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[33]  Fang Zhao,et al.  Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis , 2017, NIPS.

[34]  Andrew L. Maas Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .

[35]  Luc Van Gool,et al.  Pose Guided Person Image Generation , 2017, NIPS.

[36]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[37]  Léon Bottou,et al.  Towards Principled Methods for Training Generative Adversarial Networks , 2017, ICLR.

[38]  Xiaoming Liu,et al.  Coefficients Pose-Variant Input Recogni 8 on Engine Frontalized Output Generator FF-GAN D Discriminator Extreme Pose Input Frontalized Output , 2017 .

[39]  Georgios Tzimiropoulos,et al.  Super-FAN: Integrated Facial Landmark Localization and Super-Resolution of Real-World Low Resolution Faces in Arbitrary Poses with GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[41]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[42]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[44]  Georgios Tzimiropoulos,et al.  Object landmark discovery through unsupervised adaptation , 2019, NeurIPS.

[45]  Yu Tian,et al.  CR-GAN: Learning Complete Representations for Multi-view Generation , 2018, IJCAI.

[46]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Lijun Yin,et al.  FERA 2015 - second Facial Expression Recognition and Analysis challenge , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[48]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).