Perceptual-DualGAN: Perceptual Losses for Image to Image Translation with Generative Adversarial Nets

We consider cross-domain image-to-image translation problems, in which an input image belonging to a domain U is transformed into an output image belonging to another domain V. Many typical tasks, such as style transfer, colorization, and super-resolution, can be viewed as cross-domain image-to-image translation tasks. Recent methods such as Conditional Generative Adversarial Networks (cGANs) have made substantial progress in this field, but they require paired image data, which is hard to obtain. The DualGAN (Unsupervised Dual Learning for Image-to-Image Translation) architecture was proposed to address the lack of paired data, but it relies on simple pixel-level reconstruction losses. In this paper, we replace the pixel-level reconstruction losses with perceptual reconstruction losses and propose a more advanced framework for cross-domain image-to-image translation, named Perceptual-DualGAN. The perceptual reconstruction losses consist of feature reconstruction losses and style reconstruction losses, both of which are computed from pretrained loss networks. Experiments on multiple image translation tasks show that our framework outperforms other methods in most cases, and the results illustrate that it can generate more realistic and more natural photos.
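To make the two loss terms concrete, the following is a minimal sketch of a feature reconstruction loss and a Gram-matrix style reconstruction loss, following the common formulation from the perceptual-loss literature. The abstract does not give the exact formulas, so the normalization, the function names, and the `feature_weight` / `style_weight` parameters are assumptions made for illustration. The feature maps are assumed to have been extracted already from a pretrained loss network and are represented here as plain NumPy arrays of shape (channels, height, width).

```python
import numpy as np


def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map, normalized by C*H*W (assumed normalization)."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def feature_reconstruction_loss(feat_a, feat_b):
    """Mean squared difference between two feature maps of the same shape."""
    return float(np.mean((feat_a - feat_b) ** 2))


def style_reconstruction_loss(feat_a, feat_b):
    """Squared Frobenius norm of the difference between the two Gram matrices."""
    diff = gram_matrix(feat_a) - gram_matrix(feat_b)
    return float(np.sum(diff ** 2))


def perceptual_reconstruction_loss(feats_a, feats_b,
                                   feature_weight=1.0, style_weight=1.0):
    """Weighted sum of feature and style losses over a list of layers.

    feats_a and feats_b are lists of (C, H, W) arrays, one per chosen layer
    of the loss network. The weights are hypothetical hyperparameters.
    """
    total = 0.0
    for fa, fb in zip(feats_a, feats_b):
        total += feature_weight * feature_reconstruction_loss(fa, fb)
        total += style_weight * style_reconstruction_loss(fa, fb)
    return total
```

In a DualGAN-style setup, such a loss would presumably compare the features of an input image with those of its reconstruction after translation to the other domain and back, in place of a pixel-wise distance; how the terms are weighted and which layers are used is not specified in the abstract.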
