InvGAN: Invertible GANs

Generation of photo-realistic images, semantic editing and representation learning are a few of the many potential applications of high-resolution generative models. Recent progress in GANs has established them as an excellent choice for such tasks. However, since they do not provide an inference model, image editing and downstream tasks such as classification cannot be performed on real images using the GAN latent space. Despite numerous efforts to train an inference model or design an iterative method to invert a pre-trained generator, previous methods are specific to a dataset (e.g. human face images) and an architecture (e.g. StyleGAN), and are nontrivial to extend to novel datasets or architectures. We propose a general framework that is agnostic to architecture and dataset. Our key insight is that, by training the inference and the generative model together, we allow them to adapt to each other and to converge to a better-quality model. Our InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model. This allows us to perform image inpainting, merging, interpolation and online data augmentation. We demonstrate this with extensive qualitative and quantitative experiments.
