Paper Information - Arbitrary-Scale Image Synthesis

Arbitrary-Scale Image Synthesis

Positional encodings have enabled recent works to train a single adversarial network that can generate images at different scales. However, these approaches are either limited to a set of discrete scales or struggle to maintain good perceptual quality at scales for which the model was not explicitly trained. We propose the design of scale-consistent positional encodings that are invariant to our generator's layer transformations. This enables the generation of arbitrary-scale images, even at scales unseen during training. Moreover, we incorporate novel inter-scale augmentations and partial generation training into our pipeline to facilitate the synthesis of consistent images at arbitrary scales. Lastly, we show competitive results for a continuum of scales on various datasets commonly used for image synthesis.
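The abstract does not spell out how the encodings are constructed. As a purely illustrative sketch, not the authors' method, one simple way to obtain positions that stay consistent across resolutions is to sample pixel centers over a fixed continuous domain and feed them to a Fourier-feature encoding. The function names (`pixel_center_coords`, `fourier_encoding`, `encode_grid`), the domain [-1, 1], and the number of frequencies are all assumptions made for this example.

```python
import math

def pixel_center_coords(n, lo=-1.0, hi=1.0):
    """Coordinates of n pixel centers spread over the fixed domain [lo, hi].

    Assumption for this sketch: the domain stays the same at every
    resolution, so only the sampling density changes with n.
    """
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)]

def fourier_encoding(x, num_freqs=4):
    """Sin/cos features of a scalar coordinate at octave-spaced frequencies."""
    feats = []
    for k in range(num_freqs):
        w = (2.0 ** k) * math.pi
        feats.append(math.sin(w * x))
        feats.append(math.cos(w * x))
    return feats

def encode_grid(height, width, num_freqs=4):
    """Per-pixel encoding: x features concatenated with y features."""
    ys = pixel_center_coords(height)
    xs = pixel_center_coords(width)
    return [
        [fourier_encoding(x, num_freqs) + fourier_encoding(y, num_freqs) for x in xs]
        for y in ys
    ]

# Consistency check: averaging adjacent pairs of fine-grid coordinates
# recovers the coarse-grid coordinates, so both grids describe the same
# underlying positions at different sampling densities.
coarse = pixel_center_coords(4)
fine = pixel_center_coords(8)
pooled = [(fine[2 * i] + fine[2 * i + 1]) / 2 for i in range(len(coarse))]
```

Because the coordinates are tied to a continuous domain rather than to integer pixel indices, `encode_grid` can be called with any height and width, including resolutions never used elsewhere. Whether such a construction remains invariant under a particular generator's upsampling and convolution layers is a separate question that this sketch does not address.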
