EditGAN: High-Precision Semantic Image Editing

Generative adversarial networks (GANs) have recently found applications in image editing. However, most GAN-based image editing methods require large-scale datasets with semantic segmentation annotations for training, only provide high-level control, or merely interpolate between different images. Here, we propose EditGAN, a novel method for high-quality, high-precision semantic image editing, allowing users to edit images by modifying their highly detailed part segmentation masks, e.g., drawing a new mask for the headlight of a car. EditGAN builds on a GAN framework that jointly models images and their semantic segmentations [1, 2], requiring only a handful of labeled examples – making it a scalable tool for editing. Specifically, we embed an image into the GAN’s latent space and perform conditional latent code optimization according to the segmentation edit, which effectively also modifies the image. To amortize optimization, we find “editing vectors” in latent space that realize the edits. The framework allows us to learn an arbitrary number of editing vectors, which can then be directly applied to other images at interactive rates. We experimentally show that EditGAN can manipulate images with an unprecedented level of detail and freedom, while preserving full image quality. We can also easily combine multiple edits and perform plausible edits beyond EditGAN’s training data. We demonstrate EditGAN on a wide variety of image types and quantitatively outperform several previous editing methods on standard editing benchmark tasks. Project page: https://nv-tlabs.github.io/editGAN.
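The editing-vector idea described above can be illustrated with a toy sketch. This is not EditGAN's implementation: the two linear maps `A_IMG` and `A_SEG` are hypothetical stand-ins for a generator that produces an image and its segmentation from one latent code, the loss and its weight `lam` are assumptions chosen for illustration, and the embedding step is skipped by starting directly from a latent code. The sketch optimizes an offset `delta` so that the segmentation matches an edited target inside the edited region while the image stays close to the original outside it, then reuses `delta` on a second latent code without further optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, N_PIXELS = 8, 16

# Hypothetical joint "generator": two fixed linear heads sharing one latent code.
A_IMG = rng.normal(size=(N_PIXELS, LATENT_DIM))  # latent -> image pixels
A_SEG = rng.normal(size=(N_PIXELS, LATENT_DIM))  # latent -> segmentation scores


def generate(w):
    """Return (image, segmentation) produced from the same latent code."""
    return A_IMG @ w, A_SEG @ w


def edit_loss(w, delta, target_seg, region, lam=1.0):
    """Segmentation error inside the edit region plus image change outside it."""
    img0, _ = generate(w)
    img, seg = generate(w + delta)
    seg_term = np.sum(region * (seg - target_seg) ** 2)
    img_term = np.sum((1.0 - region) * (img - img0) ** 2)
    return seg_term + lam * img_term


def find_editing_vector(w, target_seg, region, steps=200, lam=1.0):
    """Plain gradient descent on delta, starting from zero."""
    img0, _ = generate(w)
    outside = 1.0 - region
    # Step size 1/L, with L taken from the constant Hessian of this quadratic toy loss.
    hessian = 2.0 * (A_SEG.T @ (region[:, None] * A_SEG)
                     + lam * A_IMG.T @ (outside[:, None] * A_IMG))
    lr = 1.0 / np.linalg.norm(hessian, 2)
    delta = np.zeros_like(w)
    for _ in range(steps):
        img, seg = generate(w + delta)
        grad = 2.0 * (A_SEG.T @ (region * (seg - target_seg))
                      + lam * A_IMG.T @ (outside * (img - img0)))
        delta -= lr * grad
    return delta


# Start from a latent code (stands in for the embedded input image).
w = rng.normal(size=LATENT_DIM)
_, seg0 = generate(w)
region = np.zeros(N_PIXELS)
region[:4] = 1.0                 # pixels whose mask the user re-drew
target_seg = seg0 + region       # edited segmentation: raised scores in that region

delta = find_editing_vector(w, target_seg, region)

# Reuse the learned vector on a different latent code without re-optimizing.
w_other = rng.normal(size=LATENT_DIM)
edited_img, edited_seg = generate(w_other + delta)
```

Because the toy generator is linear, adding `delta` changes the segmentation of any latent code by the same amount; a real GAN is nonlinear, so transferring an editing vector to other images is an empirical property rather than a guarantee.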

[1]  Raja Bala,et al.  Editing in Style: Uncovering the Local Semantics of GANs , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Artem Babenko,et al.  Unsupervised Discovery of Interpretable Directions in the GAN Latent Space , 2020, ICML.

[3]  Patrick Pérez,et al.  Poisson image editing , 2003, ACM Trans. Graph.

[4]  拓海 杉山,et al.  Learning Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” , 2017 .

[5]  Pietro Perona,et al.  Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Sungjoo Yoo,et al.  Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Sylvain Paris,et al.  Error-Tolerant Image Compositing , 2010, International Journal of Computer Vision.

[8]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[9]  Hye Won Chung,et al.  Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks , 2021, ArXiv.

[10]  Yisroel Mirsky,et al.  The Creation and Detection of Deepfakes , 2020, ACM Comput. Surv.

[11]  Bolei Zhou,et al.  Closed-Form Factorization of Latent Semantics in GANs , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Sanja Fidler,et al.  Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Lin Gao,et al.  DeepFaceDrawing: deep generation of face images from sketches , 2020, ACM Trans. Graph.

[14]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[15]  Anil A. Bharath,et al.  Inverting the Generator of a Generative Adversarial Network , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[16]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[17]  Bolei Zhou,et al.  Semantic photo manipulation with a generative image prior , 2019, ACM Trans. Graph.

[18]  Changxi Zheng,et al.  Linear Semantics in Generative Adversarial Networks , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Ilya Sutskever,et al.  Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.

[20]  Sanja Fidler,et al.  DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[22]  Nasser M. Nasrabadi,et al.  Style and Content Disentanglement in Generative Adversarial Networks , 2018, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[23]  Jaakko Lehtinen,et al.  Analyzing and Improving the Image Quality of StyleGAN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[25]  Alexei A. Efros,et al.  Swapping Autoencoder for Deep Image Manipulation , 2020, NeurIPS.

[26]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Alexei A. Efros,et al.  Generative Visual Manipulation on the Natural Image Manifold , 2016, ECCV.

[28]  Stefanos Zafeiriou,et al.  ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Phillip Isola,et al.  On the "steerability" of generative adversarial networks , 2019, ICLR.

[30]  Peter Wonka,et al.  SEAN: Image Synthesis With Semantic Region-Adaptive Normalization , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Mario Fritz,et al.  Inclusive GAN: Improving Data and Minority Coverage in Generative Models , 2020, ECCV.

[32]  Leon A. Gatys,et al.  Image Style Transfer Using Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Bogdan Raducanu,et al.  Invertible Conditional GANs for image editing , 2016, ArXiv.

[34]  David Salesin,et al.  Image Analogies , 2001, SIGGRAPH.

[35]  Jung-Woo Ha,et al.  Photorealistic Style Transfer via Wavelet Transforms , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[36]  Adam Finkelstein,et al.  PatchMatch: a randomized correspondence algorithm for structural image editing , 2009, SIGGRAPH 2009.

[37]  Sylvain Paris,et al.  Deep Photo Style Transfer , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Zhihui Lai,et al.  GuidedStyle: Attribute Knowledge Guided Style Manipulation for Semantic Face Editing , 2020, Neural Networks.

[39]  Sanja Fidler,et al.  DriveGAN: Towards a Controllable High-Quality Neural Simulation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Andrew Brock,et al.  Neural Photo Editing with Introspective Adversarial Networks , 2016, ICLR.

[41]  Daniel Cohen-Or,et al.  Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Andrew Chadwick,et al.  Deepfakes and Disinformation: Exploring the Impact of Synthetic Political Video on Deception, Uncertainty, and Trust in News , 2020, Social Media + Society.

[43]  George Wolberg,et al.  Digital image warping , 1990 .

[44]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[45]  Saeid Nahavandi,et al.  Deep learning for deepfakes creation and detection: A survey , 2019, Comput. Vis. Image Underst.

[46]  David Bau,et al.  Rewriting a Deep Generative Model , 2020, ECCV.

[47]  Subarna Tripathi,et al.  Precise Recovery of Latent Vectors from Generative Adversarial Networks , 2017, ICLR.

[48]  Lingyun Wu,et al.  MaskGAN: Towards Diverse and Interactive Facial Image Manipulation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Céline Hudelot,et al.  Controlling generative models with continuous factors of variations , 2020, ICLR.

[50]  Bolei Zhou,et al.  Interpreting the Latent Space of GANs for Semantic Face Editing , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[52]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[53]  Matthias Zwicker,et al.  Faceshop , 2018, ACM Trans. Graph.

[54]  Bolei Zhou,et al.  Seeing What a GAN Cannot Generate , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[55]  Minh N. Do,et al.  Semantic Image Inpainting with Deep Generative Models , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[56]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Peter Wonka,et al.  Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[58]  Aude Oliva,et al.  GANalyze: Toward Visual Definitions of Cognitive Image Properties , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[59]  Jaakko Lehtinen,et al.  GANSpace: Discovering Interpretable GAN Controls , 2020, NeurIPS.

[60]  Minyoung Huh,et al.  Transforming and Projecting Images into Class-conditional Generative Networks , 2020, ECCV.

[61]  Jan Kautz,et al.  Unsupervised Image-to-Image Translation Networks , 2017, NIPS.

[62]  Shijian Lu,et al.  Cascade EF-GAN: Progressive Facial Expression Editing With Local Focuses , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Sanja Fidler,et al.  Variational Amodal Object Completion , 2020, NeurIPS.

[64]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[65]  Artem Babenko,et al.  Navigating the GAN Parameter Space for Semantic Image Editing , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Samuel B. Williams,et al.  ASSOCIATION FOR COMPUTING MACHINERY , 2000 .

[67]  Eric Horvitz,et al.  Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting , 2019, DGS@ICLR.

[68]  Daniel Cohen-Or,et al.  Designing an encoder for StyleGAN image manipulation , 2021, ACM Trans. Graph.

[69]  Sanja Fidler,et al.  Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering , 2021, ICLR.

[70]  Bolei Zhou,et al.  InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs , 2020, IEEE transactions on pattern analysis and machine intelligence.

[71]  Shiguang Shan,et al.  AttGAN: Facial Attribute Editing by Only Changing What You Want , 2017, IEEE Transactions on Image Processing.

[72]  Alexei A. Efros,et al.  The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[73]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[74]  Stefano Ermon,et al.  Fair Generative Modeling via Weak Supervision , 2020, ICML.

[75]  Ira Kemelmacher-Shlizerman,et al.  VOGUE: Try-On by StyleGAN Interpolation Optimization , 2021, ArXiv.

[76]  Taesung Park,et al.  Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[77]  Arthur Gretton,et al.  Demystifying MMD GANs , 2018, ICLR.

[78]  Yuqi Li,et al.  GAN-Based Projector for Faster Recovery With Convergence Guarantees in Linear Inverse Problems , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[79]  Bolei Zhou,et al.  GAN Dissection: Visualizing and Understanding Generative Adversarial Networks , 2018, ICLR.

[80]  Scott Schaefer,et al.  Image deformation using moving least squares , 2006, ACM Trans. Graph.

[81]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[82]  Deli Zhao,et al.  In-Domain GAN Inversion for Real Image Editing , 2020, ECCV.

[83]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[84]  Yuri Viazovetskyi,et al.  StyleGAN2 Distillation for Feed-forward Image Manipulation , 2020, ECCV.

[85]  Binxu Wang,et al.  A Geometric Analysis of Deep Generative Image Models and Its Applications , 2021, ICLR.

[86]  Xueting Li,et al.  A Closed-form Solution to Photorealistic Image Stylization , 2018, ECCV.

[87]  Alexei A. Efros,et al.  Image quilting for texture synthesis and transfer , 2001, SIGGRAPH.

[88]  Erik Reinhard,et al.  Color Transfer between Images , 2001, IEEE Computer Graphics and Applications.

[89]  Peter Wonka,et al.  Disentangled Image Generation Through Structured Noise Injection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).