Unsupervised Controllable Generation with Self-Training

Recent generative adversarial networks (GANs) are able to generate impressive photo-realistic images. However, controllable generation with GANs remains an open research problem. Achieving controllable generation requires semantically interpretable and disentangled factors of variation, which is difficult to obtain with simple fixed priors such as the Gaussian distribution. Instead, we propose an unsupervised framework that learns, through self-training, a distribution of latent codes that control the generator. Self-training provides iterative feedback within GAN training, from the discriminator to the generator, and progressively improves the latent-code proposal as training proceeds. The latent codes are sampled from a latent variable model learned in the feature space of the discriminator. We consider a normalized independent component analysis (ICA) model and learn its parameters through tensor factorization of its higher-order moments. Our framework exhibits better disentanglement than alternatives such as the variational autoencoder, and discovers semantically meaningful latent codes without any supervision. We demonstrate empirically on both car and face datasets that each group of elements in the learned code controls a single mode of variation with a semantic meaning, e.g., pose or background change. Quantitative metrics further show that our method generates better results than competing approaches.
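
The pipeline sketched in the abstract (estimate an ICA-style latent model from discriminator features via moment tensor factorization, then sample codes for the generator) can be illustrated compactly. The snippet below is a minimal sketch, not the authors' implementation: it assumes discriminator features are available as a NumPy array, uses a symmetric CP decomposition of the empirical third-order moment (via TensorLy's `parafac`) to estimate mixing directions, and samples normalized codes from the fitted model; all names such as `fit_latent_model` and `sample_latent_codes` are illustrative, and the self-training refit is indicated only schematically.

```python
# Minimal sketch (illustrative, not the paper's code): fit an ICA-style latent
# model to discriminator features by factorizing a third-order moment tensor,
# then sample normalized latent codes from the fitted model.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac


def fit_latent_model(features, rank):
    """Estimate mixing directions from discriminator features.

    features: array of shape (n_samples, d), e.g. penultimate-layer
              discriminator activations.
    rank:     number of independent components to recover.
    """
    x = features - features.mean(axis=0, keepdims=True)
    n, d = x.shape
    # Empirical third-order moment tensor E[x (x) x (x) x], shape (d, d, d).
    m3 = np.einsum("ni,nj,nk->ijk", x, x, x) / n
    # Symmetric CP decomposition; for an ICA-style model the factors estimate
    # the mixing directions up to sign and scale.
    _, factors = parafac(tl.tensor(m3), rank=rank, normalize_factors=True)
    return factors[0]  # (d, rank) matrix of component directions


def sample_latent_codes(mixing, n_codes):
    """Sample codes: independent non-Gaussian sources pushed through the
    estimated mixing, then normalized (hence a 'normalized' ICA model)."""
    rank = mixing.shape[1]
    sources = np.random.laplace(size=(n_codes, rank))
    codes = sources @ mixing.T
    return codes / np.linalg.norm(codes, axis=1, keepdims=True)


# Self-training loop (schematic only): as GAN training proceeds, periodically
# refit the latent model on current discriminator features and feed the
# improved codes back to the generator, e.g.
#   feats  = discriminator_features(generator(sample_latent_codes(mixing, B)))
#   mixing = fit_latent_model(feats, rank)
```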
