Multi-Mapping Image-to-Image Translation with Central Biasing Normalization

Recent advances in image-to-image translation have seen a rise in approaches generating diverse images through a single network. To indicate the target domain for a one-to-many mapping, the latent code is injected into the generator network. However, we found that the injection method leads to mode collapse because of normalization strategies. Existing normalization strategies might either cause the inconsistency of feature distribution or eliminate the effect of the latent code. To solve these problems, we propose the consistency within diversity criteria for designing the multi-mapping model. Based on the criteria, we propose central biasing normalization to inject the latent code information. Experiments show that our method can improve the quality and diversity of existing image-to-image translation models, such as StarGAN, BicycleGAN, and pix2pix.

[1]  Andrea Vedaldi,et al.  Improved Texture Networks: Maximizing Quality and Diversity in Feed-Forward Stylization and Texture Synthesis , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Thomas H. Li,et al.  StructureFlow: Image Inpainting via Structure-Aware Appearance Flow , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[3]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Ian J. Goodfellow,et al.  NIPS 2016 Tutorial: Generative Adversarial Networks , 2016, ArXiv.

[5]  Alexei A. Efros,et al.  Colorful Image Colorization , 2016, ECCV.

[6]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[7]  Lior Wolf,et al.  Unsupervised Cross-Domain Image Generation , 2016, ICLR.

[8]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[9]  Jiangjiang Liu,et al.  Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground , 2018, ECCV.

[10]  Xiaofeng Tao,et al.  Transient attributes for high-level understanding and editing of outdoor scenes , 2014, ACM Trans. Graph..

[11]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[12]  Kristen Grauman,et al.  Fine-Grained Visual Comparisons with Local Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Ole Winther,et al.  Autoencoding beyond pixels using a learned similarity metric , 2015, ICML.

[14]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[15]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[16]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[17]  Radim Sára,et al.  Spatial Pattern Templates for Recognition of Objects with Regular Structure , 2013, GCPR.

[18]  Alexei A. Efros,et al.  Unbiased look at dataset bias , 2011, CVPR 2011.

[19]  Ping Tan,et al.  DualGAN: Unsupervised Dual Learning for Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  H. Shimodaira,et al.  Improving predictive inference under covariate shift by weighting the log-likelihood function , 2000 .

[21]  Hyunsoo Kim,et al.  Learning to Discover Cross-Domain Relations with Generative Adversarial Networks , 2017, ICML.

[22]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[23]  Qingyao Wu,et al.  Cartoon-to-Photo Facial Translation with Generative Adversarial Networks , 2018, ACML.

[24]  Kwang In Kim,et al.  Unsupervised Attention-guided Image to Image Translation , 2018, NeurIPS.

[25]  Ling Shao,et al.  Video Co-Saliency Guided Co-Segmentation , 2018, IEEE Transactions on Circuits and Systems for Video Technology.

[26]  Yang Cao,et al.  Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Serge J. Belongie,et al.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[29]  Ming-Yu Liu,et al.  Coupled Generative Adversarial Networks , 2016, NIPS.

[30]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[31]  Jason Cong,et al.  Minimizing Computation in Convolutional Neural Networks , 2014, ICANN.

[32]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[33]  Tom White,et al.  Generative Adversarial Networks: An Overview , 2017, IEEE Signal Processing Magazine.

[34]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[35]  Yu Tian,et al.  CR-GAN: Learning Complete Representations for Multi-view Generation , 2018, IJCAI.

[36]  Jan Kautz,et al.  Unsupervised Image-to-Image Translation Networks , 2017, NIPS.

[37]  Xinbo Gao,et al.  Face Sketch Synthesis via Sparse Representation-Based Greedy Search , 2015, IEEE Transactions on Image Processing.

[38]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[39]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[40]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[41]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[42]  Rongrong Ji,et al.  Scoot: A Perceptual Metric for Facial Sketches , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[43]  Andrea Vedaldi,et al.  Instance Normalization: The Missing Ingredient for Fast Stylization , 2016, ArXiv.

[44]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Chao Yang,et al.  Show, Attend, and Translate: Unsupervised Image Translation With Self-Regularization and Attention , 2018, IEEE Transactions on Image Processing.

[46]  Alexei A. Efros,et al.  Generative Visual Manipulation on the Natural Image Manifold , 2016, ECCV.

[47]  Pieter Abbeel,et al.  InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[48]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[49]  Xing Cai,et al.  SingleGAN: Image-to-Image Translation by a Single-Generator Network using Multiple Generative Adversarial Learning , 2018, ACCV.

[50]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[51]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Ming-Ming Cheng,et al.  EGNet: Edge Guidance Network for Salient Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[54]  Lior Wolf,et al.  One-Sided Unsupervised Domain Mapping , 2017, NIPS.

[55]  Guillaume Lample,et al.  Fader Networks: Manipulating Images by Sliding Attributes , 2017, NIPS.

[56]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[57]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[58]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[59]  Huaiqin Wu Global stability analysis of a general class of discontinuous neural networks with linear growth activation functions , 2009, Inf. Sci..

[60]  Andrew Brock,et al.  Neural Photo Editing with Introspective Adversarial Networks , 2016, ICLR.

[61]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[62]  George Cybenko,et al.  Approximation by superpositions of a sigmoidal function , 1992, Math. Control. Signals Syst..

[63]  Alexei A. Efros,et al.  Toward Multimodal Image-to-Image Translation , 2017, NIPS.

[64]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.