Arbitrary-Scale Image Synthesis

Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales. However, these approaches are either limited to a set of discrete scales or struggle to maintain good perceptual quality at scales for which the model is not explicitly trained. We propose scale-consistent positional encodings that are invariant to the transformations applied by our generator's layers. This enables the generation of arbitrary-scale images, even at scales unseen during training. Moreover, we incorporate novel inter-scale augmentations and partial generation training into our pipeline to facilitate the synthesis of consistent images at arbitrary scales. Lastly, we show competitive results for a continuum of scales on various commonly used image synthesis datasets.
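
As a rough illustration of what scale consistency can mean for a positional encoding, the sketch below ties the encoding to a continuous image location rather than to a pixel index, so the same location receives the same code at any sampling resolution. This is not the encoding proposed in the paper, which the abstract does not specify. The Fourier-feature form, the pixel-centre sampling, and the names coord_grid and fourier_encode are all assumptions made for the example.

```python
import numpy as np


def coord_grid(height, width, extent=1.0):
    """Pixel-centre coordinates covering [-extent, extent] on both axes.

    Hypothetical helper: the continuous domain stays fixed while the
    number of samples (height, width) is free to change.
    """
    ys = ((np.arange(height) + 0.5) / height * 2.0 - 1.0) * extent
    xs = ((np.arange(width) + 0.5) / width * 2.0 - 1.0) * extent
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([xx, yy], axis=-1)  # shape (H, W, 2)


def fourier_encode(coords, num_freqs=4):
    """Sin/cos features of continuous coordinates (assumed encoding form)."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi  # shape (F,)
    angles = coords[..., None] * freqs  # shape (H, W, 2, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(*coords.shape[:-1], -1)  # shape (H, W, 4F)


# The centre of pixel i on a 3-sample axis coincides with the centre of
# pixel 3*i + 1 on a 9-sample axis, so those encodings can be compared.
low = fourier_encode(coord_grid(3, 3))
high = fourier_encode(coord_grid(9, 9))
print(np.allclose(high[1::3, 1::3], low))
```

Because the encoding is a function of location only, the 9x9 grid reproduces the 3x3 encodings at the shared pixel centres up to floating-point error. An index-based encoding would not have this property.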
