Paper information - Instance-Conditioned GAN

Instance-Conditioned GAN

Generative Adversarial Networks (GANs) can generate near-photorealistic images in narrow domains such as human faces. Yet, modeling complex distributions of datasets such as ImageNet and COCO-Stuff remains challenging in unconditional settings. In this paper, we take inspiration from kernel density estimation techniques and introduce a non-parametric approach to modeling distributions of complex datasets. We partition the data manifold into a mixture of overlapping neighborhoods, each described by a datapoint and its nearest neighbors, and introduce a model, called instance-conditioned GAN (IC-GAN), which learns the distribution around each datapoint. Experimental results on ImageNet and COCO-Stuff show that IC-GAN significantly improves over unconditional models and unsupervised data partitioning baselines. Moreover, we show that IC-GAN can effortlessly transfer to datasets not seen during training by simply changing the conditioning instances, and still generate realistic images. Finally, we extend IC-GAN to the class-conditional case and show semantically controllable generation and competitive quantitative results on ImageNet, while improving over BigGAN on ImageNet-LT. Code and trained models to reproduce the reported results are available at https://github.com/facebookresearch/ic_gan.
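
The neighborhood partitioning described in the abstract can be illustrated with a minimal sketch. It is not the released implementation. It assumes that each datapoint is already represented by a feature vector and that neighbors are ranked by cosine similarity; the abstract fixes neither choice. The function names `build_neighborhoods` and `sample_training_pair`, the random features, and the values of `k` and the feature dimension are hypothetical and chosen only for illustration.

```python
import numpy as np


def build_neighborhoods(features: np.ndarray, k: int) -> np.ndarray:
    """Return an (n, k) array: row i holds the indices of the k datapoints
    most similar to datapoint i (assumption: cosine similarity).
    Each point counts as a member of its own neighborhood."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed.T  # pairwise cosine similarities, shape (n, n)
    # Sort each row by decreasing similarity and keep the first k indices.
    return np.argsort(-sims, axis=1)[:, :k]


def sample_training_pair(neighbors: np.ndarray, rng: np.random.Generator):
    """Pick a conditioning instance uniformly at random, then pick a
    'real' sample uniformly from that instance's neighborhood."""
    i = int(rng.integers(neighbors.shape[0]))
    j = int(neighbors[i, rng.integers(neighbors.shape[1])])
    return i, j


# Hypothetical stand-in for feature embeddings of a dataset.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))

neighbors = build_neighborhoods(features, k=5)
cond_idx, real_idx = sample_training_pair(neighbors, rng)
```

Because every datapoint defines its own neighborhood, a single datapoint can belong to several of them, which is what makes the neighborhoods overlap. In this sketch the pair `(cond_idx, real_idx)` stands in for a conditioning instance and a sample drawn from the distribution around it. The dense similarity matrix is only practical for small datasets.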
