Labels4Free: Unsupervised Segmentation using StyleGAN

We propose an unsupervised segmentation framework for StyleGAN generated objects. We build on two main observations. First, the features generated by StyleGAN hold valuable information that can be utilized towards training segmentation networks. Second, the foreground and background can often be treated to be largely independent and be swapped across images to produce plausible composited images. For our solution, we propose to augment the StyleGAN2 generator architecture with a segmentation branch and to split the generator into a foreground and background network. This enables us to generate soft segmentation masks for the foreground object in an unsupervised fashion. On multiple object classes, we report comparable results against state-of-the-art supervised segmentation networks, while against the best unsupervised segmentation approach we demonstrate a clear improvement, both in qualitative and quantitative metrics. Project Page : https:/rameenabdal.github.io/Labels4Free

[1]  Peter Wonka,et al.  Image2StyleGAN++: How to Edit the Embedded Images? , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Alexandros G. Dimakis,et al.  Intermediate Layer Optimization for Inverse Problems using Deep Generative Models , 2021, ICML.

[3]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[4]  Arun Mallya,et al.  One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Yinda Zhang,et al.  LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop , 2015, ArXiv.

[6]  Adam Bielski,et al.  Emergence of Object Segmentation in Perturbed Generative Models , 2019, NeurIPS.

[7]  Bolei Zhou,et al.  Seeing What a GAN Cannot Generate , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[8]  Tero Karras,et al.  Training Generative Adversarial Networks with Limited Data , 2020, NeurIPS.

[9]  Peter Wonka,et al.  StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows , 2020, ArXiv.

[10]  Daniel Cohen-Or,et al.  Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Bolei Zhou,et al.  InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs , 2020, IEEE transactions on pattern analysis and machine intelligence.

[12]  Luc Van Gool,et al.  Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals , 2021, ArXiv.

[13]  Julien Rabin,et al.  Detecting Overfitting of Deep Generative Networks via Latent Recovery , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Gang Yu,et al.  BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation , 2018, ECCV.

[15]  Peter Wonka,et al.  Improved StyleGAN Embedding: Where are the Good Latents? , 2020, ArXiv.

[16]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[17]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[18]  Jaakko Lehtinen,et al.  GANSpace: Discovering Interpretable GAN Controls , 2020, NeurIPS.

[19]  Dani Lischinski,et al.  StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Michael S. Brown,et al.  HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms , 2020, ArXiv.

[21]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[22]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[23]  Gözde B. Ünal,et al.  Patch-Based Image Inpainting with Generative Adversarial Networks , 2018, ArXiv.

[24]  Myriam Tami,et al.  Autoregressive Unsupervised Image Segmentation , 2020, ECCV.

[25]  Christian Theobalt,et al.  PIE , 2020, ACM Trans. Graph..

[26]  Cynthia Rudin,et al.  PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Lingyun Wu,et al.  MaskGAN: Towards Diverse and Interactive Facial Image Manipulation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Thomas S. Huang,et al.  Free-Form Image Inpainting With Gated Convolution , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Bo Dai,et al.  Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D Image GANs , 2020, ArXiv.

[30]  Deli Zhao,et al.  In-Domain GAN Inversion for Real Image Editing , 2020, ECCV.

[31]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Peter Wonka,et al.  Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[33]  Jaakko Lehtinen,et al.  Analyzing and Improving the Image Quality of StyleGAN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Xu Ji,et al.  Invariant Information Clustering for Unsupervised Image Classification and Segmentation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[36]  Raja Bala,et al.  Editing in Style: Uncovering the Local Semantics of GANs , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Jason Lawrence,et al.  Time-travel rephotography , 2020, ACM Trans. Graph..

[39]  Bolei Zhou,et al.  GAN Dissection: Visualizing and Understanding Generative Adversarial Networks , 2018, ICLR.

[40]  Christian Theobalt,et al.  StyleRig: Rigging StyleGAN for 3D Control Over Portrait Images , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Bolei Zhou,et al.  Generative Hierarchical Features from Synthesizing Images , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Ludovic Denoyer,et al.  Unsupervised Object Segmentation by Redrawing , 2019, NeurIPS.