MULTI-DOMAIN IMAGE GENERATION AND TRANSLATION WITH IDENTIFIABILITY GUARANTEES

Multi-domain image generation and unpaired image-to-image translation are two important and related computer vision problems. A common technique for both tasks is learning a joint distribution from multiple marginal distributions. However, it is well known that infinitely many joint distributions can yield the same marginals, so suitable constraints are needed to address this highly ill-posed problem. Inspired by recent advances in nonlinear Independent Component Analysis (ICA) theory, we propose a new method to learn the joint distribution from the marginals by enforcing a specific type of minimal change across domains. We report one of the first results connecting multi-domain generative models to identifiability, and we show why identifiability is essential and how to achieve it both theoretically and practically. We apply our method to five multi-domain image generation and six image-to-image translation tasks. The superior performance of our model supports our theory and demonstrates the effectiveness of our method. The training code is available at https://github.com/Mid-Push/i-stylegan.
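To make the ill-posedness concrete, here is a minimal worked example of ours (not taken from the paper): for two binary variables $X$ and $Y$ with uniform marginals, an entire one-parameter family of joint distributions, indexed by a parameter $\epsilon$ introduced only for this illustration, reproduces exactly the same marginals:

\[
P_\epsilon(X = x, Y = y) \;=\; \tfrac{1}{4} + \epsilon\,(-1)^{x+y},
\qquad x, y \in \{0, 1\},\quad \epsilon \in \left[-\tfrac{1}{4}, \tfrac{1}{4}\right],
\]

since $\sum_{y} P_\epsilon(X = x, Y = y) = \tfrac{1}{2}$ and $\sum_{x} P_\epsilon(X = x, Y = y) = \tfrac{1}{2}$ for every such $\epsilon$. Without additional constraints, such as the minimal-change assumption above, no amount of marginal data can single out one joint distribution.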
