Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation

In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C2GAN) for the task of keypoint-guided image generation. The proposed C2GAN is a cross-modal framework exploring a joint exploitation of the keypoint and the image data in an interactive manner. C2GAN contains two different types of generators, i.e., keypoint-oriented generator and image-oriented generator. Both of them are mutually connected in an end-to-end learnable fashion and explicitly form three cycled sub-networks, i.e., one image generation cycle and two keypoint generation cycles. Each cycle not only aims at reconstructing the input domain, and also produces useful output involving in the generation of another cycle. By so doing, the cycles constrain each other implicitly, which provides complementary information from the two different modalities and brings extra supervision across cycles, thus facilitating more robust optimization of the whole network. Extensive experimental results on two publicly available datasets, i.e., Radboud Faces and Market-1501, demonstrate that our approach is effective to generate more photo-realistic images compared with state-of-the-art models.

[1]  Nicu Sebe,et al.  Every Smile is Unique: Landmark-Guided Diverse Smile Generation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Dacheng Tao,et al.  Attention-GAN for Object Transfiguration in Wild Images , 2018, ECCV.

[3]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[4]  Luc Van Gool,et al.  Natural and Effective Obfuscation by Head Inpainting , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[6]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[7]  Augustus Odena,et al.  Semi-Supervised Learning with Generative Adversarial Networks , 2016, ArXiv.

[8]  Bogdan Raducanu,et al.  Invertible Conditional GANs for image editing , 2016, ArXiv.

[9]  Nando de Freitas,et al.  Generating Interpretable Images with Controllable Structure , 2017 .

[10]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Ping Tan,et al.  DualGAN: Unsupervised Dual Learning for Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[12]  Jan Kautz,et al.  Video-to-Video Synthesis , 2018, NeurIPS.

[13]  Mahadev Satyanarayanan,et al.  OpenFace: A general-purpose face recognition library with mobile applications , 2016 .

[14]  Yi Yang,et al.  GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data , 2017, BMVC 2017.

[15]  Lior Wolf,et al.  Unsupervised Cross-Domain Image Generation , 2016, ICLR.

[16]  Luc Van Gool,et al.  Disentangled Person Image Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17]  Nicu Sebe,et al.  Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Tao Mei,et al.  DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Kwang In Kim,et al.  Unsupervised Attention-guided Image to Image Translation , 2018, NeurIPS.

[20]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[22]  Abhinav Gupta,et al.  Generative Image Modeling Using Style and Structure Adversarial Networks , 2016, ECCV.

[23]  Luc Van Gool,et al.  ComboGAN: Unrestrained Scalability for Image Domain Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[24]  Nicu Sebe,et al.  Deformable GANs for Pose-Based Human Image Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Skyler T. Hawk,et al.  Presentation and validation of the Radboud Faces Database , 2010 .

[26]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[27]  Lucas Theis,et al.  Fast Face-Swap Using Convolutional Neural Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[28]  Lantao Yu,et al.  SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.

[29]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[31]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[32]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[33]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Taesung Park,et al.  Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Nicu Sebe,et al.  Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation , 2019, 2019 International Joint Conference on Neural Networks (IJCNN).

[36]  Luc Van Gool,et al.  Pose Guided Person Image Generation , 2017, NIPS.

[37]  Alexei A. Efros,et al.  Everybody Dance Now , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[38]  Jinwoo Shin,et al.  InstaGAN: Instance-aware Image-to-Image Translation , 2018, ICLR.

[39]  Hugo Latapie,et al.  Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation , 2019, 2020 25th International Conference on Pattern Recognition (ICPR).

[40]  Vishal M. Patel,et al.  GP-GAN: Gender Preserving GAN for Synthesizing Faces from Landmarks , 2017, 2018 24th International Conference on Pattern Recognition (ICPR).

[41]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[42]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Nicu Sebe,et al.  GazeCorrection: Self-Guided Eye Manipulation in the wild using Self-Supervised Generative Adversarial Networks , 2019, ArXiv.

[44]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[45]  Tieniu Tan,et al.  Geometry Guided Adversarial Facial Expression Synthesis , 2017, ACM Multimedia.

[46]  Nicu Sebe,et al.  Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation , 2018, ACCV.

[47]  Bingbing Ni,et al.  Skeleton-Aided Articulated Motion Generation , 2017, ACM Multimedia.

[48]  Heiga Zen,et al.  WaveNet: A Generative Model for Raw Audio , 2016, SSW.

[49]  Nicu Sebe,et al.  GestureGAN for Hand Gesture-to-Gesture Translation in the Wild , 2018, ACM Multimedia.

[50]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[51]  Hui Chen,et al.  Geometry-Contrastive Generative Adversarial Network for Facial Expression Synthesis , 2018, ArXiv.

[52]  Ruslan Salakhutdinov,et al.  Generating Images from Captions with Attention , 2015, ICLR.

[53]  Yann LeCun,et al.  Deep multi-scale video prediction beyond mean square error , 2015, ICLR.

[54]  Cristian Canton-Ferrer,et al.  Eye In-painting with Exemplar Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[55]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[56]  Antonio Torralba,et al.  Generating Videos with Scene Dynamics , 2016, NIPS.

[57]  Hyunsoo Kim,et al.  Learning to Discover Cross-Domain Relations with Generative Adversarial Networks , 2017, ICML.

[58]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[59]  Ali Borji,et al.  Cross-View Image Synthesis Using Conditional GANs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[60]  Nicu Sebe,et al.  Animating Arbitrary Objects via Deep Motion Transfer , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).