Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

State-of-the-art methods for image-to-image translation with Generative Adversarial Networks (GANs) can learn a mapping from one domain to another domain using unpaired image data. However, these methods require the training of one specific model for every pair of image domains, which limits the scalability in dealing with more than two image domains. In addition, the training stage of these methods has the common problem of model collapse that degrades the quality of the generated images. To tackle these issues, we propose a Dual Generator Generative Adversarial Network (G$^2$GAN), which is a robust and scalable approach allowing to perform unpaired image-to-image translation for multiple domains using only dual generators within a single model. Moreover, we explore different optimization losses for better training of G$^2$GAN, and thus make unpaired image-to-image translation with higher consistency and better stability. Extensive experiments on six publicly available datasets with different scenarios, i.e., architectural buildings, seasons, landscape and human faces, demonstrate that the proposed G$^2$GAN achieves superior model capacity and better generation performance comparing with existing image-to-image translation GAN models.

[1]  Jun Wang,et al.  A 3D facial expression database for facial behavior research , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[2]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[3]  Ming-Yu Liu,et al.  Coupled Generative Adversarial Networks , 2016, NIPS.

[4]  Nicu Sebe,et al.  Every Smile is Unique: Landmark-Guided Diverse Smile Generation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Luc Van Gool,et al.  ComboGAN: Unrestrained Scalability for Image Domain Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[6]  Nicu Sebe,et al.  Recurrent Face Aging with Hierarchical AutoRegressive Memory , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Minh N. Do,et al.  Semantic Image Inpainting with Perceptual and Contextual Losses , 2016, ArXiv.

[9]  Fisher Yu,et al.  Scribbler: Controlling Deep Image Synthesis with Sketch and Color , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Aleix M. Martinez,et al.  The AR face database , 1998 .

[11]  Andrew Brock,et al.  Neural Photo Editing with Introspective Adversarial Networks , 2016, ICLR.

[12]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[13]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[14]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[15]  Alexei A. Efros,et al.  Toward Multimodal Image-to-Image Translation , 2017, NIPS.

[16]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17]  Léon Bottou,et al.  Wasserstein GAN , 2017, ArXiv.

[18]  Luc Van Gool,et al.  Disentangled Person Image Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Luc Van Gool,et al.  Pose Guided Person Image Generation , 2017, NIPS.

[20]  Chuan Li,et al.  Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks , 2016, ECCV.

[21]  Ruslan Salakhutdinov,et al.  Generating Images from Captions with Attention , 2015, ICLR.

[22]  Trung Le,et al.  Dual Discriminator Generative Adversarial Nets , 2017, NIPS.

[23]  Ersin Yumer,et al.  Neural Face Editing with Intrinsic Image Disentangling , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Eric P. Xing,et al.  Generative Semantic Manipulation with Mask-Contrasting GAN , 2018, ECCV.

[25]  Guo-Jun Qi,et al.  Loss-Sensitive Generative Adversarial Networks on Lipschitz Densities , 2017, International Journal of Computer Vision.

[26]  Eric P. Xing,et al.  Generative Semantic Manipulation with Contrasting GAN , 2017, ArXiv.

[27]  Nicu Sebe,et al.  GestureGAN for Hand Gesture-to-Gesture Translation in the Wild , 2018, ACM Multimedia.

[28]  Lior Wolf,et al.  Unsupervised Cross-Domain Image Generation , 2016, ICLR.

[29]  Radim Sára,et al.  Spatial Pattern Templates for Recognition of Objects with Regular Structure , 2013, GCPR.

[30]  Lior Wolf,et al.  One-Sided Unsupervised Domain Mapping , 2017, NIPS.

[31]  Nicu Sebe,et al.  Deformable GANs for Pose-Based Human Image Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32]  Skyler T. Hawk,et al.  Presentation and validation of the Radboud Faces Database , 2010 .

[33]  Ping Tan,et al.  DualGAN: Unsupervised Dual Learning for Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[36]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[37]  Daniel Rueckert,et al.  Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Jan Kautz,et al.  Unsupervised Image-to-Image Translation Networks , 2017, NIPS.

[39]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[40]  Bogdan Raducanu,et al.  Invertible Conditional GANs for image editing , 2016, ArXiv.

[41]  Sebastian Ruder,et al.  An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.

[42]  Yong Yu,et al.  Face Transfer with Generative Adversarial Network , 2017, ArXiv.

[43]  Tomas Pfister,et al.  Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Ersin Yumer,et al.  Transformation-Grounded Image Generation Network for Novel 3D View Synthesis , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[46]  Ming-Hsuan Yang,et al.  Generative Face Completion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Liang Lin,et al.  BeautyGAN: Instance-level Facial Makeup Transfer with Deep Generative Adversarial Network , 2018, ACM Multimedia.

[48]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[49]  Zhou Wang,et al.  Multiscale structural similarity for image quality assessment , 2003, The Thrity-Seventh Asilomar Conference on Signals, Systems & Computers, 2003.

[50]  A. Martínez,et al.  The AR face databasae , 1998 .

[51]  Hyunsoo Kim,et al.  Learning to Discover Cross-Domain Relations with Generative Adversarial Networks , 2017, ICML.

[52]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[53]  Minh N. Do,et al.  Semantic Image Inpainting with Deep Generative Models , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).