Paper information - Face Translation between Images and Videos using Identity-aware CycleGAN

Face Translation between Images and Videos using Identity-aware CycleGAN

This paper presents a new problem of unpaired face translation between images and videos, which can be applied to facial video prediction and enhancement. The problem poses two major technical challenges: 1) designing a robust translation model between static images and dynamic videos, and 2) preserving facial identity during image-video translation. To address these two challenges, we generalize the state-of-the-art image-to-image translation network (Cycle-Consistent Adversarial Networks) to the image-to-video/video-to-image translation context by exploiting an image-video translation model and an identity preservation model. In particular, we apply the state-of-the-art Wasserstein GAN technique to the setting of image-video translation for better convergence, and we introduce a face verifier to preserve identity. Experiments on standard image/video face datasets demonstrate the effectiveness of the proposed model in terms of both qualitative and quantitative evaluations.
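The abstract names three ingredients: a Wasserstein adversarial term, a cycle-consistency term, and an identity-preservation term. As a rough illustration only, the sketch below shows how such terms are commonly combined into one generator objective. It is not the paper's implementation. The function names, the L1 form of the cycle loss, the cosine-distance form of the identity loss, and the weights `lambda_cyc` and `lambda_id` are all assumptions made for this example.

```python
import numpy as np


def wasserstein_critic_loss(critic_real, critic_fake):
    """Critic loss in the Wasserstein GAN formulation: E[D(fake)] - E[D(real)]."""
    return float(np.mean(critic_fake) - np.mean(critic_real))


def cycle_consistency_loss(x, x_reconstructed):
    """Mean absolute error between an input and its round-trip reconstruction.

    Works for any array shape, e.g. an image (H, W, C) or a video (T, H, W, C).
    """
    return float(np.mean(np.abs(x - x_reconstructed)))


def identity_loss(emb_source, emb_translated):
    """Assumed identity term: mean cosine distance between face embeddings.

    Both inputs have shape (batch, dim) and would come from a face
    verification network, which is not modelled here.
    """
    a = emb_source / np.linalg.norm(emb_source, axis=1, keepdims=True)
    b = emb_translated / np.linalg.norm(emb_translated, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(a * b, axis=1)))


def generator_objective(critic_fake, x, x_reconstructed,
                        emb_source, emb_translated,
                        lambda_cyc=10.0, lambda_id=1.0):
    """Weighted sum of adversarial, cycle, and identity terms (weights are hypothetical)."""
    adversarial = -float(np.mean(critic_fake))  # generator tries to raise the critic score
    cyc = cycle_consistency_loss(x, x_reconstructed)
    ident = identity_loss(emb_source, emb_translated)
    return adversarial + lambda_cyc * cyc + lambda_id * ident
```

In a real training loop the critic scores and embeddings would be produced by neural networks and the terms would be differentiated by an autodiff framework; plain NumPy is used here only to keep the arithmetic visible.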

Luc Van Gool | Zhiwu Huang | Jiqing Wu | Danda Pani Paudel | Bernhard Kratzwald
