Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Class-conditioning offers a direct means of controlling a Generative Adversarial Network (GAN) based on a discrete input variable. While necessary in many applications, the additional information provided by the class labels could even be expected to benefit the training of the GAN itself. Contrary to this belief, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability. Motivated by this observation, we propose a training strategy for conditional GANs (cGANs) that effectively prevents the observed mode-collapse by leveraging unconditional learning. Our training strategy starts with an unconditional GAN and gradually injects conditional information into the generator and the objective function. The proposed method for training cGANs with limited data results not only in stable training but also in generating high-quality images, thanks to the early-stage exploitation of the shared information across classes. We analyze the aforementioned mode collapse problem in comprehensive experiments on four datasets. Our approach demonstrates outstanding results compared with state-of-the-art methods and established baselines. The code is available at https://github.com/mshahbazi72/transitional-cGAN

[1]  Matthieu Guillaumin,et al.  Food-101 - Mining Discriminative Components with Random Forests , 2014, ECCV.

[3]  Hung-Yu Tseng,et al.  Regularizing Generative Adversarial Networks under Limited Data , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[5]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[6]  Song Han,et al.  Differentiable Augmentation for Data-Efficient GAN Training , 2020, NeurIPS.

[7]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[8]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[9]  Luc Van Gool,et al.  Efficient Conditional GAN Transfer with Knowledge Propagation across Classes , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[11]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12]  Arthur Gretton,et al.  Demystifying MMD GANs , 2018, ICLR.

[13]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[14]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[15]  Takeru Miyato,et al.  cGANs with Projection Discriminator , 2018, ICLR.

[16]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[17]  Yizhe Zhu,et al.  Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis , 2021, ICLR.

[18]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[19]  Yong Yu,et al.  Activation Maximization Generative Adversarial Nets , 2017 .

[20]  Jaakko Lehtinen,et al.  Analyzing and Improving the Image Quality of StyleGAN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Tero Karras,et al.  Training Generative Adversarial Networks with Limited Data , 2020, NeurIPS.

[22]  L. Gool,et al.  SRFlow: Learning the Super-Resolution Space with Normalizing Flow , 2020, ECCV.

[23]  Lawrence Carin,et al.  On Leveraging Pretrained GANs for Limited-Data Generation , 2020, ICML 2020.

[24]  Jaakko Lehtinen,et al.  Improved Precision and Recall Metric for Assessing Generative Models , 2019, NeurIPS.

[25]  Tatsuya Harada,et al.  Image Generation From Small Datasets via Batch Statistics Adaptation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[26]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[27]  Zhe Gan,et al.  AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Ross B. Girshick,et al.  LVIS: A Dataset for Large Vocabulary Instance Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Jinwoo Shin,et al.  Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs , 2020, 2002.10964.

[30]  Song-Chun Zhu,et al.  Learning Hybrid Image Templates (HIT) by Information Projection , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Jaesik Park,et al.  ContraGAN: Contrastive Learning for Conditional Image Generation , 2020, Neural Information Processing Systems.

[33]  Fahad Shahbaz Khan,et al.  MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Jaakko Lehtinen,et al.  Few-Shot Unsupervised Image-to-Image Translation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[35]  Bogdan Raducanu,et al.  Transferring GANs: generating images from limited data , 2018, ECCV.