Semi-Supervised StyleGAN for Disentanglement Learning

Disentanglement learning is crucial for obtaining disentangled representations and controllable generation. Current disentanglement methods face several inherent limitations: difficulty with high-resolution images, a primary focus on learning disentangled representations rather than enabling controllable generation, and non-identifiability due to the unsupervised setting. To alleviate these limitations, we design new architectures and loss functions based on StyleGAN (Karras et al., 2019) for semi-supervised, high-resolution disentanglement learning. We create two complex high-resolution synthetic datasets for systematic testing. We investigate the impact of limited supervision and find that using only 0.25% to 2.5% of labeled data is sufficient for good disentanglement on both synthetic and real datasets. We propose new metrics to quantify generator controllability, and observe that there may exist a crucial trade-off between disentangled representation learning and controllable generation. We also consider semantic fine-grained image editing to achieve better generalization to unseen images.
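The abstract does not spell out the new architectures or loss functions. As a rough illustration only, the sketch below shows one generic way a small labeled subset (echoing the 0.25% to 2.5% figure) can enter GAN training as an auxiliary supervised term: the discriminator gains a head that regresses the labeled factors, and the generator is additionally penalized when the factors recovered from its samples disagree with the codes it was conditioned on. The toy MLP networks, the names `aux_head` and `lambda_sup`, and the specific loss form are assumptions for illustration, not the paper's StyleGAN-based method.

```python
# Minimal PyTorch sketch of semi-supervised GAN training with an auxiliary
# factor-regression head. Toy MLPs stand in for the paper's StyleGAN-based
# generator; all hyperparameters and names here are illustrative assumptions.
import torch
import torch.nn as nn

factor_dim, noise_dim, img_dim = 5, 64, 3 * 64 * 64  # toy sizes

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + factor_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),
        )
    def forward(self, z, c):
        # condition generation on factor code c alongside noise z
        return self.net(torch.cat([z, c], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU())
        self.adv_head = nn.Linear(256, 1)           # real/fake logit
        self.aux_head = nn.Linear(256, factor_dim)  # factor regression head
    def forward(self, x):
        h = self.trunk(x)
        return self.adv_head(h), self.aux_head(h)

G, D = Generator(), Discriminator()
adv_loss, sup_loss = nn.BCEWithLogitsLoss(), nn.MSELoss()
lambda_sup = 1.0  # weight of the supervised term (assumed)

def discriminator_loss(real_x, labeled_x, labels, batch):
    z, c = torch.randn(batch, noise_dim), torch.rand(batch, factor_dim)
    fake_x = G(z, c).detach()
    real_logit, _ = D(real_x)
    fake_logit, _ = D(fake_x)
    _, pred = D(labeled_x)  # supervised term uses only the small labeled subset
    return (adv_loss(real_logit, torch.ones_like(real_logit))
            + adv_loss(fake_logit, torch.zeros_like(fake_logit))
            + lambda_sup * sup_loss(pred, labels))

def generator_loss(batch):
    z, c = torch.randn(batch, noise_dim), torch.rand(batch, factor_dim)
    fake_logit, pred = D(G(z, c))
    # adversarial term plus consistency between input factors and recovered ones
    return (adv_loss(fake_logit, torch.ones_like(fake_logit))
            + lambda_sup * sup_loss(pred, c))

# Usage (shapes only): d_loss = discriminator_loss(real_batch, x_lab, y_lab, batch=16)
```

The design point the sketch is meant to convey is that the unlabeled data drives the ordinary adversarial objective, while the few labels anchor specific latent dimensions to named factors, addressing the non-identifiability of purely unsupervised disentanglement.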

[1] Tolga Tasdizen, et al. Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning, 2016, NIPS.

[2] Takeru Miyato, et al. cGANs with Projection Discriminator, 2018, ICLR.

[3] Sebastian Nowozin, et al. Multi-Level Variational Autoencoder: Learning Disentangled Representations from Grouped Observations, 2017, AAAI.

[4] Yong-Liang Yang, et al. HoloGAN: Unsupervised Learning of 3D Representations From Natural Images, 2019, ICCV Workshops.

[5] Pieter Abbeel, et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, 2016, NIPS.

[6] Simon Osindero, et al. Conditional Generative Adversarial Nets, 2014, arXiv.

[7] Andrea Vedaldi, et al. Instance Normalization: The Missing Ingredient for Fast Stylization, 2016, arXiv.

[8] David Berthelot, et al. MixMatch: A Holistic Approach to Semi-Supervised Learning, 2019, NeurIPS.

[9] Bernhard Schölkopf, et al. Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations, 2018, ICML.

[10] Jung-Woo Ha, et al. StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation, 2017, CVPR 2018.

[11] Hongyi Zhang, et al. mixup: Beyond Empirical Risk Minimization, 2017, ICLR.

[12] Stefano Soatto, et al. Emergence of Invariance and Disentanglement in Deep Representations, 2017, ITA 2018.

[13] Roger B. Grosse, et al. Isolating Sources of Disentanglement in Variational Autoencoders, 2018, NeurIPS.

[14] Abhishek Kumar, et al. Variational Inference of Disentangled Latent Concepts from Unlabeled Observations, 2017, ICLR.

[15] Andriy Mnih, et al. Disentangling by Factorising, 2018, ICML.

[16] Sewoong Oh, et al. InfoGAN-CR: Disentangling Generative Adversarial Networks with Contrastive Regularizers, 2019, ICML 2020.

[17] Jonathon Shlens, et al. Conditional Image Synthesis with Auxiliary Classifier GANs, 2016, ICML.

[18] Frank D. Wood, et al. Learning Disentangled Representations with Semi-Supervised Deep Generative Models, 2017, NIPS.

[19] Jaakko Lehtinen, et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017, ICLR.

[20] Jinwen Ma, et al. DNA-GAN: Learning Disentangled Representations from Multi-Attribute Images, 2017, ICLR.

[21] Michael C. Mozer, et al. Learning Deep Disentangled Embeddings with the F-Statistic Loss, 2018, NeurIPS.

[22] Max Welling, et al. Semi-supervised Learning with Deep Generative Models, 2014, NIPS.

[23] Stefan Bauer, et al. On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset, 2019, NeurIPS.

[24] Sepp Hochreiter, et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, 2017, NIPS.

[25] Otmar Hilliges, et al. Guiding InfoGAN with Semi-supervision, 2017, ECML/PKDD.

[26] Timo Aila, et al. A Style-Based Generator Architecture for Generative Adversarial Networks, 2018, CVPR 2019.

[27] Aapo Hyvärinen, et al. Nonlinear independent component analysis: Existence and uniqueness results, 1999, Neural Networks.

[28] Kimmo Kärkkäinen, et al. FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age, 2019, arXiv.

[29] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2016, ICLR.

[30] Joshua B. Tenenbaum, et al. Deep Convolutional Inverse Graphics Network, 2015, NIPS.

[31] Serge J. Belongie, et al. Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization, 2017, ICCV.

[32] Stefan Bauer, et al. Disentangling Factors of Variations Using Few Labels, 2020, ICLR.