Unsupervised Controllable Generation with Self-Training

Recent generative adversarial networks (GANs) are able to generate impressive photo-realistic images. However, controllable generation with GANs remains a challenging research problem. Achieving controllable generation requires semantically interpretable and disentangled factors of variation, which is difficult to obtain with simple fixed priors such as the Gaussian distribution. Instead, we propose an unsupervised framework that learns a distribution of latent codes controlling the generator through self-training. Self-training provides iterative feedback during GAN training, from the discriminator to the generator, and progressively improves the latent code proposals as training proceeds. The latent codes are sampled from a latent variable model that is learned in the feature space of the discriminator. We consider a normalized independent component analysis (ICA) model and learn its parameters through tensor factorization of higher-order moments. Our framework exhibits stronger disentanglement than alternatives such as the variational autoencoder, and discovers semantically meaningful latent codes without any supervision. We demonstrate empirically, on both car and face datasets, that each group of elements in the learned code controls a semantically meaningful mode of variation, e.g., pose or background changes. We also show with quantitative metrics that our method outperforms competing approaches.
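
The following is a minimal sketch of one round of the self-training loop described above, under several stated assumptions: it uses TensorLy's CP decomposition as a stand-in for the tensor factorization of higher-order moments, assumes a PyTorch generator whose latent dimension matches the discriminator feature dimension, and assumes a `discriminator.features(...)` hook exposing the discriminator's feature space. Names such as `fit_latent_model` and `self_training_step` are illustrative, not the authors' implementation.

```python
# Hedged sketch, not the paper's code: one self-training round in which
# discriminator features are used to refit a moment-based latent code model.
import torch
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend('pytorch')


def third_order_moment(features):
    # Empirical third-order moment E[f (x) f (x) f] of discriminator features,
    # shape (d, d, d) for features of shape (n, d).
    n, d = features.shape
    return torch.einsum('ni,nj,nk->ijk', features, features, features) / n


def fit_latent_model(features, rank):
    # Factorize the moment tensor with a rank-k CP decomposition; the first
    # factor matrix (d, k) plays the role of the mixing matrix of an
    # ICA-style latent variable model (a simplification of the paper's
    # normalized ICA estimator).
    moment = third_order_moment(features)
    weights, factors = parafac(moment, rank=rank)
    return factors[0]


def self_training_step(generator, discriminator, mixing, n_samples, code_dim):
    # Sample codes, map them through the current mixing estimate into the
    # generator's latent space (assumed to have dimension d), generate images,
    # and collect discriminator features as feedback for the next fit.
    codes = torch.randn(n_samples, code_dim)
    z = codes @ mixing.t()
    with torch.no_grad():
        images = generator(z)
        features = discriminator.features(images)  # assumed feature hook
    return fit_latent_model(features, rank=code_dim)
```

In practice the loop would be bootstrapped with standard Gaussian latents (e.g., an identity-like initial `mixing`), and the refitted mixing matrix from each round would be used to propose latent codes for the next round of GAN updates; those scheduling details are omitted here.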
