Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly

Training generative adversarial networks (GANs) with limited data generally results in deteriorated performance and collapsed models. To address this challenge, we draw inspiration from the recent observation of Kalibhat et al. (2020) and Chen et al. (2021d) that one can discover independently trainable and highly sparse subnetworks (a.k.a. lottery tickets) in GANs. Treating this as an inductive prior, we decompose the data-hungry GAN training problem into two sequential sub-problems: (i) identifying the lottery ticket from the original GAN; and (ii) training the found sparse subnetwork with aggressive data and feature augmentations. Both sub-problems re-use the same small training set of real images. Such a coordinated framework enables us to focus on lower-complexity, more data-efficient sub-problems, effectively stabilizing training and improving convergence. Comprehensive experiments demonstrate the effectiveness of our proposed ultra-data-efficient training framework across various GAN architectures (SNGAN, BigGAN, and StyleGAN2) and diverse datasets (CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet). Moreover, our training framework displays strong few-shot generalization ability, generating high-fidelity images when trained from scratch on as few as 100 real images, without any pre-training. Code is available at https://github.com/VITA-Group/Ultra-Data-Efficient-GAN-Training.
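Sub-problem (i) follows the lottery-ticket recipe of [18] as adapted to GANs by [9]: alternately train, prune the smallest-magnitude weights, and rewind the surviving weights to their initialization. The following is a minimal PyTorch sketch under stated assumptions, not the paper's exact procedure; train_gan is a hypothetical placeholder for one's own adversarial training loop, and a full implementation would also keep pruned weights frozen during retraining.

    import copy
    import torch

    def magnitude_mask(model, sparsity):
        # Keep the largest-magnitude weights; prune the smallest `sparsity` fraction.
        scores = torch.cat([p.detach().abs().flatten()
                            for n, p in model.named_parameters() if "weight" in n])
        k = max(int(sparsity * scores.numel()), 1)
        threshold = scores.kthvalue(k).values
        return {n: (p.detach().abs() > threshold).float()
                for n, p in model.named_parameters() if "weight" in n}

    def find_ticket(generator, data, rounds=5, prune_per_round=0.2):
        init_state = copy.deepcopy(generator.state_dict())  # theta_0, kept for rewinding
        sparsity, mask = 0.0, None
        for _ in range(rounds):
            train_gan(generator, data)                      # hypothetical training loop
            sparsity = 1.0 - (1.0 - sparsity) * (1.0 - prune_per_round)
            mask = magnitude_mask(generator, sparsity)
            generator.load_state_dict(init_state)           # rewind survivors to init
            with torch.no_grad():
                for n, p in generator.named_parameters():
                    if n in mask:
                        p.mul_(mask[n])                     # zero out pruned weights
        return mask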
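Sub-problem (ii) then trains only the found ticket, pairing it with differentiable data augmentation in the spirit of DiffAugment [51]: the same randomized, differentiable transform is applied to both real and generated images before the discriminator, so gradients still reach the generator through the augmentation. The sketch below is illustrative, showing only a random-shift policy (practical policies combine color, translation, and cutout) and assuming G, D, and their optimizers are already built.

    import torch
    import torch.nn.functional as F

    def rand_shift(x, ratio=0.125):
        # Random circular shift as a simple stand-in for DiffAugment's zero-padded
        # translation; torch.roll is differentiable w.r.t. x, so gradients flow to G.
        h, w = x.shape[2], x.shape[3]
        sx = int(torch.randint(-int(h * ratio), int(h * ratio) + 1, (1,)))
        sy = int(torch.randint(-int(w * ratio), int(w * ratio) + 1, (1,)))
        return torch.roll(x, shifts=(sx, sy), dims=(2, 3))

    def discriminator_step(G, D, real, z, opt_d):
        # Non-saturating GAN loss; the SAME augmentation family hits real and fake.
        loss = (F.softplus(-D(rand_shift(real))) +
                F.softplus(D(rand_shift(G(z).detach())))).mean()
        opt_d.zero_grad(); loss.backward(); opt_d.step()

    def generator_step(G, D, z, opt_g):
        loss = F.softplus(-D(rand_shift(G(z)))).mean()  # gradients pass through the aug
        opt_g.zero_grad(); loss.backward(); opt_g.step()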

[1] Jinwoo Shin et al. Freeze the Discriminator: A Simple Baseline for Fine-Tuning GANs. arXiv:2002.10964, 2020.

[2] Ngai-Man Cheung et al. Towards Good Practices for Data Augmentation in GAN Training. arXiv preprint, 2020.

[3] Jaakko Lehtinen et al. Analyzing and Improving the Image Quality of StyleGAN. CVPR, 2020.

[4] Bernt Schiele et al. Disentangling Adversarial Robustness and Generalization. CVPR, 2019.

[5] Honglak Lee et al. Improved Consistency Regularization for GANs. AAAI, 2021.

[6] Xiaohua Zhai et al. Self-Supervised GANs via Auxiliary Rotation Loss. CVPR, 2019.

[7] Quoc V. Le et al. Adversarial Examples Improve Image Recognition. CVPR, 2020.

[8] Ting Chen et al. Robust Pre-Training by Adversarial Contrastive Learning. NeurIPS, 2020.

[9] Tianlong Chen et al. GANs Can Play Lottery Tickets Too. ICLR, 2021.

[10] Yu Cheng et al. Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning. CVPR, 2020.

[11] Sebastian Nowozin et al. Which Training Methods for GANs Do Actually Converge? ICML, 2018.

[12] Dimitris N. Metaxas et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks. ICCV, 2017.

[13] Samy Bengio et al. Adversarial Machine Learning at Scale. ICLR, 2017.

[14] Mauricio J. Serrano et al. The Sooner the Better: Investigating Structure of Early Winning Lottery Tickets. 2019.

[15] Joan Bruna et al. Intriguing Properties of Neural Networks. ICLR, 2014.

[16] Michael I. Jordan et al. Theoretically Principled Trade-off between Robustness and Accuracy. ICML, 2019.

[17] Yu Cheng et al. Large-Scale Adversarial Training for Vision-and-Language Representation Learning. NeurIPS, 2020.

[18] Michael Carbin et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. ICLR, 2019.

[19] Li Fei-Fei et al. ImageNet: A Large-Scale Hierarchical Image Database. CVPR, 2009.

[20] Chao Xu et al. Distilling Portable Generative Adversarial Networks for Image Translation. AAAI, 2020.

[21] Jaakko Lehtinen et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation. ICLR, 2018.

[22] Song Han et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR, 2016.

[23] Soumith Chintala et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. ICLR, 2016.

[24] Wojciech Zaremba et al. Improved Techniques for Training GANs. NeurIPS, 2016.

[25] Yoshua Bengio et al. Small-GAN: Speeding Up GAN Training Using Core-Sets. ICML, 2020.

[26] Yoshua Bengio et al. Generative Adversarial Nets. NeurIPS, 2014.

[27] Xiaohua Zhai et al. High-Fidelity Image Generation with Fewer Labels. ICML, 2019.

[28] Colin Wei et al. Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin. ICLR, 2020.

[29] Erich Elsen et al. The Difficulty of Training Sparse Neural Networks. arXiv preprint, 2019.

[30] Tianlong Chen et al. Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free. NeurIPS, 2020.

[31] Jeff Donahue et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis. ICLR, 2019.

[32] Soheil Feizi et al. Winning Lottery Tickets in Deep Generative Models. AAAI, 2021.

[33] Zhe Gan et al. EarlyBERT: Efficient BERT Training via Early-Bird Lottery Tickets. ACL, 2021.

[34] David J. Schwab et al. The Early Phase of Neural Network Training. ICLR, 2020.

[35] Timo Aila et al. A Style-Based Generator Architecture for Generative Adversarial Networks. CVPR, 2019.

[36] Erich Elsen et al. The State of Sparsity in Deep Neural Networks. arXiv preprint, 2019.

[37] Jaesik Park et al. ContraGAN: Contrastive Learning for Conditional Image Generation. NeurIPS, 2020.

[38] David Bau et al. Diverse Image Generation via Self-Conditioned GANs. CVPR, 2020.

[39] Shrey Desai et al. Evaluating Lottery Tickets Under Distributional Shifts. EMNLP, 2019.

[40] Zhangyang Wang et al. A Unified Lottery Ticket Hypothesis for Graph Neural Networks. ICML, 2021.

[41] Alex Krizhevsky et al. Learning Multiple Layers of Features from Tiny Images. Technical report, 2009.

[42] Rob Fergus et al. Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks. NeurIPS, 2015.

[43] Sepp Hochreiter et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. NeurIPS, 2017.

[44] Dan Zhang et al. PA-GAN: Improving GAN Training by Progressive Augmentation. arXiv preprint, 2019.

[45] Zhiqiang Shen et al. Learning Efficient Convolutional Networks through Network Slimming. ICCV, 2017.

[46] Jiayu Wu et al. Tiny ImageNet Challenge. 2017.

[47] Takeru Miyato et al. cGANs with Projection Discriminator. ICLR, 2018.

[48] Yue Wang et al. Drawing Early-Bird Tickets: Towards More Efficient Training of Deep Networks. ICLR, 2020.

[49] Raymond Y. K. Lau et al. Least Squares Generative Adversarial Networks. ICCV, 2017.

[50] Zhangyang Wang et al. Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning. ICLR, 2021.

[51] Song Han et al. Differentiable Augmentation for Data-Efficient GAN Training. NeurIPS, 2020.

[52] Han Zhang et al. Self-Attention Generative Adversarial Networks. ICML, 2019.

[53] Shiyu Chang et al. The Lottery Ticket Hypothesis for Pre-trained BERT Networks. NeurIPS, 2020.

[54] Léon Bottou et al. Wasserstein Generative Adversarial Networks. ICML, 2017.

[55] Yu Cheng et al. FreeLB: Enhanced Adversarial Training for Natural Language Understanding. ICLR, 2020.

[56] Yuichi Yoshida et al. Spectral Normalization for Generative Adversarial Networks. ICLR, 2018.

[57] Roger B. Grosse et al. Picking Winning Tickets Before Training by Preserving Gradient Flow. ICLR, 2020.

[58] Shiyu Chang et al. AutoGAN: Neural Architecture Search for Generative Adversarial Networks. ICCV, 2019.

[59] Gintare Karolina Dziugaite et al. Linear Mode Connectivity and the Lottery Ticket Hypothesis. ICML, 2020.

[60] Aleksander Madry et al. Towards Deep Learning Models Resistant to Adversarial Attacks. ICLR, 2018.

[61] Bogdan Raducanu et al. Transferring GANs: Generating Images from Limited Data. ECCV, 2018.

[62] Tero Karras et al. Training Generative Adversarial Networks with Limited Data. NeurIPS, 2020.

[63] Léon Bottou et al. Towards Principled Methods for Training Generative Adversarial Networks. ICLR, 2017.

[64] Aleksander Madry et al. Adversarially Robust Generalization Requires More Data. NeurIPS, 2018.

[65] Dilin Wang et al. Improving Neural Language Modeling via Adversarial Training. ICML, 2019.

[66] Yuandong Tian et al. Playing the Lottery with Rewards and Multiple Languages: Lottery Tickets in RL and NLP. ICLR, 2020.

[67] Jonathon Shlens et al. Explaining and Harnessing Adversarial Examples. ICLR, 2015.

[68] Mingjie Sun et al. Rethinking the Value of Network Pruning. ICLR, 2019.

[69] Terrance DeVries et al. Instance Selection for GANs. NeurIPS, 2020.

[70] Tianlong Chen et al. Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference. ICLR, 2020.

[71] Zhenan Sun et al. A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. IEEE Transactions on Knowledge and Data Engineering, 2020.

[72] Tatsuya Harada et al. Image Generation from Small Datasets via Batch Statistics Adaptation. ICCV, 2019.

[73] Shiyu Chang et al. The Lottery Tickets Hypothesis for Supervised and Self-Supervised Pre-Training in Computer Vision Models. CVPR, 2021.

[74] Song-Chun Zhu et al. Learning Hybrid Image Templates (HIT) by Information Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.

[75] Zhangyang Wang et al. DeblurGAN-v2: Deblurring (Orders-of-Magnitude) Faster and Better. ICCV, 2019.

[76] Aleksander Madry et al. Robustness May Be at Odds with Accuracy. ICLR, 2019.

[77] Aditi Raghunathan et al. Adversarial Training Can Hurt Generalization. arXiv preprint, 2019.

[78] Zhanxing Zhu et al. Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors. arXiv preprint, 2019.

[79] Michael Maire et al. Winning the Lottery with Continuous Sparsification. NeurIPS, 2020.

[80] Honglak Lee et al. Consistency Regularization for Generative Adversarial Networks. ICLR, 2020.

[81] Preetum Nakkiran et al. Adversarial Robustness May Be at Odds with Simplicity. arXiv preprint, 2019.

[82] Ding Liu et al. EnlightenGAN: Deep Light Enhancement Without Paired Supervision. IEEE Transactions on Image Processing, 2021.

[83] Aaron C. Courville et al. Improved Training of Wasserstein GANs. NeurIPS, 2017.

[84] Fahad Shahbaz Khan et al. MineGAN: Effective Knowledge Transfer from GANs to Target Domains with Few Images. CVPR, 2020.

[85] Michael Carbin et al. Comparing Rewinding and Fine-Tuning in Neural Network Pruning. ICLR, 2020.

[86] Tomas Pfister et al. Learning from Simulated and Unsupervised Images through Adversarial Training. CVPR, 2017.