Disentangling by Factorising

We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation. We propose FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions. We show that it improves upon $\beta$-VAE by providing a better trade-off between disentanglement and reconstruction quality. Moreover, we highlight the problems of a commonly used disentanglement metric and introduce a new metric that does not suffer from them.

[1]  Michael Satosi Watanabe,et al.  Information Theoretical Analysis of Multivariate Correlation , 1960, IBM J. Res. Dev..

[2]  E. Giné,et al.  On the Bootstrap of $U$ and $V$ Statistics , 1992 .

[3]  Jürgen Schmidhuber,et al.  Learning Factorial Codes by Predictability Minimization , 1992, Neural Computation.

[4]  Shun-ichi Amari,et al.  Adaptive Online Learning Algorithms for Blind Separation: Maximum Entropy and Minimum Mutual Information , 1997, Neural Computation.

[5]  Quoc V. Le,et al.  Measuring Invariances in Deep Networks , 2009, NIPS.

[6]  Sami Romdhani,et al.  A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[7]  E. Rolls,et al.  Continuous transformation learning of translation invariant representations , 2010, Experimental Brain Research.

[8]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[9]  Martin J. Wainwright,et al.  Estimating Divergence Functionals and the Likelihood Ratio by Convex Risk Minimization , 2008, IEEE Transactions on Information Theory.

[10]  Geoffrey E. Hinton,et al.  Transforming Auto-Encoders , 2011, ICANN.

[11]  Christopher K. I. Williams,et al.  Transformation Equivariant Boltzmann Machines , 2011, ICANN.

[12]  Yoshua Bengio,et al.  Disentangling Factors of Variation via Generative Entangling , 2012, ArXiv.

[13]  Masashi Sugiyama,et al.  Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation , 2012 .

[14]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Max Welling,et al.  Learning the Irreducible Representations of Commutative Lie Groups , 2014, ICML.

[16]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[17]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[18]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[19]  Max Welling,et al.  Semi-supervised Learning with Deep Generative Models , 2014, NIPS.

[20]  Alexei A. Efros,et al.  Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Bernhard Schölkopf,et al.  A Permutation-Based Kernel Conditional Independence Test , 2014, UAI.

[22]  Yuting Zhang,et al.  Learning to Disentangle Factors of Variation with Manifold Interaction , 2014, ICML.

[23]  Andrea Vedaldi,et al.  Understanding Image Representations by Measuring Their Equivariance and Equivalence , 2014, International Journal of Computer Vision.

[24]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[25]  Jonathan Tompson,et al.  Unsupervised Learning of Spatiotemporally Coherent Metrics , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[26]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[27]  Joshua B. Tenenbaum,et al.  Deep Convolutional Inverse Graphics Network , 2015, NIPS.

[28]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[29]  Andriy Mnih,et al.  Variational Inference for Monte Carlo Objectives , 2016, ICML.

[30]  Shakir Mohamed,et al.  Learning in Implicit Generative Models , 2016, ArXiv.

[31]  Pieter Abbeel,et al.  InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[32]  Yann LeCun,et al.  Disentangling factors of variation in deep representation using adversarial training , 2016, NIPS.

[33]  Joshua B. Tenenbaum,et al.  Building machines that learn and think like people , 2016, Behavioral and Brain Sciences.

[34]  Geoffrey E. Hinton,et al.  Layer Normalization , 2016, ArXiv.

[35]  Sebastian Nowozin,et al.  f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization , 2016, NIPS.

[36]  Frank D. Wood,et al.  Learning Disentangled Representations with Semi-Supervised Deep Generative Models , 2017, NIPS.

[37]  Brendan J. Frey,et al.  PixelGAN Autoencoders , 2017, NIPS.

[38]  Vighnesh Birodkar,et al.  Unsupervised Learning of Disentangled Representations from Video , 2017, NIPS.

[39]  Sebastian Nowozin,et al.  Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks , 2017, ICML.

[40]  Lucas Theis,et al.  Amortised MAP Inference for Image Super-resolution , 2016, ICLR.

[41]  Yee Whye Teh,et al.  The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.

[42]  Christopher Burgess,et al.  beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework , 2016, ICLR 2016.

[43]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[44]  Yu Zhang,et al.  Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data , 2017, NIPS.

[45]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[46]  Abhishek Kumar,et al.  Variational Inference of Disentangled Latent Concepts from Unlabeled Observations , 2017, ICLR.

[47]  Roger B. Grosse,et al.  Isolating Sources of Disentanglement in Variational Autoencoders , 2018, NeurIPS.

[48]  Yoshua Bengio,et al.  Learning Independent Features with Adversarial Nets for Non-linear ICA , 2017, 1710.05050.

[49]  Stefano Soatto,et al.  Information Dropout: Learning Optimal Representations Through Noisy Computation , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[50]  Christopher K. I. Williams,et al.  A Framework for the Quantitative Evaluation of Disentangled Representations , 2018, ICLR.

[51]  Shakir Mohamed,et al.  Distribution Matching in Variational Inference , 2018, ArXiv.