OOGAN: Disentangling GAN with One-Hot Sampling and Orthogonal Regularization

Exploring the potential of GANs for unsupervised disentanglement learning, this paper proposes a novel GAN-based disentanglement framework with One-Hot Sampling and Orthogonal Regularization (OOGAN). While previous works mostly tackle disentanglement learning through VAEs and seek to implicitly minimize the Total Correlation (TC) objective with various approximation methods, we show that GANs have a natural advantage in disentangling, via an alternating latent variable (noise) sampling method that is straightforward and robust. Furthermore, we provide a new perspective on designing the structure of the generator and discriminator, demonstrating that a minor structural change and an orthogonal regularization on model weights entail improved disentanglement. Rather than experimenting only on simple toy datasets, we conduct experiments on higher-resolution images and show that OOGAN greatly pushes the boundary of unsupervised disentanglement.
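
To make the two named ingredients concrete, below is a minimal PyTorch sketch (not the authors' implementation) of (a) alternating between continuous and one-hot sampling of the controllable latent codes and (b) an orthogonal penalty ||WW^T - I||_F^2 on layer weights. The code count `n_c`, the 50/50 alternation schedule, and the penalty weight `lambda_ortho` are illustrative assumptions not specified in the abstract.

```python
# Sketch of one-hot latent sampling and orthogonal weight regularization.
# Assumptions (not from the paper): n_c codes, 50/50 alternation, lambda_ortho.
import torch
import torch.nn.functional as F

def sample_codes(batch_size: int, n_c: int, one_hot: bool) -> torch.Tensor:
    """Sample the controllable latent codes c.

    With one_hot=True, each code vector is one-hot, so exactly one factor
    is active per sample; otherwise codes are continuous in [0, 1).
    """
    if one_hot:
        idx = torch.randint(n_c, (batch_size,))
        return F.one_hot(idx, num_classes=n_c).float()
    return torch.rand(batch_size, n_c)

def orthogonal_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Frobenius-norm penalty ||W W^T - I||_F^2 on a flattened weight matrix,
    pushing its rows toward orthonormality (i.e., decorrelated filters)."""
    w = weight.reshape(weight.size(0), -1)   # conv kernel -> (out, in*k*k)
    gram = w @ w.t()                         # (out, out) row Gram matrix
    eye = torch.eye(w.size(0), device=w.device)
    return ((gram - eye) ** 2).sum()
```

In a training loop one could, for example, alternate `one_hot=True` and `one_hot=False` mini-batches when sampling codes for the generator, and add `lambda_ortho * orthogonal_penalty(layer.weight)` over selected layers to the loss; the alternation ratio and `lambda_ortho` are hyperparameters left unstated here.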
