Disentangling Factors of Variation Using Few Labels

Learning disentangled representations is considered a cornerstone problem in representation learning. Recently, Locatello et al. (2019) demonstrated that unsupervised disentanglement learning without inductive biases is theoretically impossible and that existing inductive biases and unsupervised methods do not allow one to consistently learn disentangled representations. However, in many practical settings, one might have access to a limited amount of supervision, for example through manual labeling of (some) factors of variation in a few training examples. In this paper, we investigate the impact of such supervision on state-of-the-art disentanglement methods and perform a large-scale study, training over 52,000 models under well-defined and reproducible experimental conditions. We observe that a small number of labeled examples (0.01–0.5% of the dataset), with potentially imprecise and incomplete labels, is sufficient to perform model selection on state-of-the-art unsupervised models. Further, we investigate the benefit of incorporating supervision into the training process itself. Overall, we empirically validate that with little and imprecise supervision it is possible to reliably learn disentangled representations.
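
The model-selection protocol can be made concrete with a small sketch. The Python code below is a minimal, self-contained illustration, not the implementation used in the paper: random linear encoders stand in for trained unsupervised models (e.g., beta-VAE variants), and a simple correlation-based proxy stands in for the disentanglement metrics of the study. Only the selection loop mirrors the procedure described above: score every candidate model on a small labeled subset and keep the best one.

    # Hypothetical sketch of few-label model selection. make_encoder() is a
    # stand-in for a trained unsupervised model, and disentanglement_score()
    # is a simple correlation-based proxy, not a metric used in the paper.
    import numpy as np

    rng = np.random.default_rng(0)
    n_factors, n_latents = 5, 10

    # Ground-truth factors of variation (observed only for the labeled subset).
    factors = rng.uniform(-1.0, 1.0, size=(10000, n_factors))

    def make_encoder():
        """Stand-in for an unsupervised model: a fixed random linear map
        from factors to latent codes, plus observation noise."""
        W = rng.normal(size=(n_factors, n_latents))
        return lambda f: f @ W + 0.1 * rng.normal(size=(f.shape[0], n_latents))

    def disentanglement_score(codes, labels):
        """For each factor, measure how concentrated its absolute correlation
        is on a single latent dimension (1.0 = perfectly disentangled)."""
        k = codes.shape[1]
        corr = np.abs(np.corrcoef(codes.T, labels.T)[:k, k:]) + 1e-12
        weight = corr / corr.sum(axis=0, keepdims=True)  # per-factor distribution
        return float(weight.max(axis=0).mean())

    # A small labeled subset (here 100 examples, about 1% of the data) is all
    # the supervision the selection step uses.
    labeled = factors[rng.choice(len(factors), size=100, replace=False)]

    candidates = [make_encoder() for _ in range(10)]  # sweep of trained models
    scores = [disentanglement_score(enc(labeled), labeled) for enc in candidates]
    print("selected model", int(np.argmax(scores)))

Note that the labels enter only through the scoring step, so the cost of supervision grows with the size of the labeled subset rather than with the amount of unlabeled training data.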

[1] Ole Winther, et al. A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning, 2017, NIPS.

[2] Philippe Beaudoin, et al. Disentangling the independently controllable factors of variation by interacting with the world, 2018, ArXiv.

[3] Michael C. Mozer, et al. Learning Deep Disentangled Embeddings with the F-Statistic Loss, 2018, NeurIPS.

[4] Sjoerd van Steenkiste, et al. Are Disentangled Representations Helpful for Abstract Visual Reasoning?, 2019, NeurIPS.

[5] Xavier Binefa, et al. Learning Disentangled Representations with Reference-Based Variational Autoencoders, 2019, ICLR.

[6] Yoshua Bengio, et al. Scaling learning algorithms towards AI, 2007.

[7] Alexander D'Amour, et al. On Multi-Cause Approaches to Causal Inference with Unobserved Confounding: Two Cautionary Failure Cases and A Promising Alternative, 2019, AISTATS.

[8] Pascal Vincent, et al. Representation Learning: A Review and New Perspectives, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Frank D. Wood, et al. Learning Disentangled Representations with Semi-Supervised Deep Generative Models, 2017, NIPS.

[10] Stefan Bauer, et al. On the Fairness of Disentangled Representations, 2019, NeurIPS.

[11] Bruno A. Olshausen, et al. Discovering Hidden Factors of Variation in Deep Networks, 2014, ICLR.

[12] Bernhard Schölkopf, et al. Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations, 2018, ICML.

[13] Vighnesh Birodkar, et al. Unsupervised Learning of Disentangled Representations from Video, 2017, NIPS.

[14] Quoc V. Le, et al. Measuring Invariances in Deep Networks, 2009, NIPS.

[15] Gunnar Rätsch, et al. SOM-VAE: Interpretable Discrete Representation Learning on Time Series, 2019, ICLR.

[16] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.

[17] Guillaume Desjardins, et al. Understanding disentangling in β-VAE, 2018, ArXiv:1804.03599.

[18] Max Welling, et al. Learning the Irreducible Representations of Commutative Lie Groups, 2014, ICML.

[19] Joshua B. Tenenbaum, et al. Understanding Visual Concepts with Continuation Learning, 2016, ArXiv.

[20] Max Welling, et al. Semi-supervised Learning with Deep Generative Models, 2014, NIPS.

[21] Juan Carlos Niebles, et al. Learning to Decompose and Disentangle Representations for Video Prediction, 2018, NeurIPS.

[22] Abhishek Kumar, et al. Variational Inference of Disentangled Latent Concepts from Unlabeled Observations, 2017, ICLR.

[23] Serge J. Belongie, et al. Bayesian representation learning with oracle constraints, 2015, ICLR.

[24] Michael I. Jordan, et al. Kernel independent component analysis, 2003.

[25] Yann LeCun, et al. Disentangling factors of variation in deep representation using adversarial training, 2016, NIPS.

[26] Scott E. Reed, et al. Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis, 2015, NIPS.

[27] Aapo Hyvärinen, et al. Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA, 2016, NIPS.

[28] Geoffrey E. Hinton, et al. Transforming Auto-Encoders, 2011, ICANN.

[29] Richard S. Zemel, et al. Learning Latent Subspaces in Variational Autoencoders, 2018, NeurIPS.

[30] Yuting Zhang, et al. Learning to Disentangle Factors of Variation with Manifold Interaction, 2014, ICML.

[31] J. Karhunen, et al. Advances in Nonlinear Blind Source Separation, 2003.

[32] Andrea Vedaldi, et al. Understanding Image Representations by Measuring Their Equivariance and Equivalence, 2014, International Journal of Computer Vision.

[33] Sebastian Nowozin, et al. Multi-Level Variational Autoencoder: Learning Disentangled Representations from Grouped Observations, 2017, AAAI.

[34] Yee Whye Teh, et al. Disentangling Disentanglement in Variational Autoencoders, 2018, ICML.

[35] Yann LeCun, et al. Learning to Linearize Under Uncertainty, 2015, NIPS.

[36] Pieter Abbeel, et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, 2016, NIPS.

[37] Olivier Bachem, et al. Recent Advances in Autoencoder-Based Representation Learning, 2018, ArXiv.

[38] Stephan Mandt, et al. Disentangled Sequential Autoencoder, 2018, ICML.

[39] Andriy Mnih, et al. Disentangling by Factorising, 2018, ICML.

[40] P. Spirtes, et al. Causation, prediction, and search, 1993.

[41] Gunnar Rätsch, et al. Competitive Training of Mixtures of Independent Deep Generative Models, 2018.

[42] Roger B. Grosse, et al. Isolating Sources of Disentanglement in Variational Autoencoders, 2018, NeurIPS.

[43] Stefan Bauer, et al. Interventional Robustness of Deep Latent Variable Models, 2018, ArXiv.

[44] Yuting Zhang, et al. Deep Visual Analogy-Making, 2015, NIPS.

[45] Christopher Burgess, et al. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning, 2017, ICML.

[46] Aapo Hyvärinen, et al. Nonlinear ICA Using Auxiliary Variables and Generalized Contrastive Learning, 2018, AISTATS.

[47] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2016, ICLR.

[48] Joshua B. Tenenbaum, et al. Deep Convolutional Inverse Graphics Network, 2015, NIPS.

[49] Christopher K. I. Williams, et al. A Framework for the Quantitative Evaluation of Disentangled Representations, 2018, ICLR.

[50] Bernhard Schölkopf, et al. On causal and anticausal learning, 2012, ICML.

[51] Stefan Bauer, et al. On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset, 2019, NeurIPS.

[52] Zoubin Ghahramani, et al. Discovering Interpretable Representations for Both Deep Generative and Discriminative Models, 2018, ICML.

[53] Bernhard Schölkopf, et al. Elements of Causal Inference: Foundations and Learning Algorithms, 2017.

[54] Sergey Levine, et al. Visual Reinforcement Learning with Imagined Goals, 2018, NeurIPS.

[55] David Pfau, et al. Towards a Definition of Disentangled Representations, 2018, ArXiv.

[56] Joshua B. Tenenbaum, et al. Building machines that learn and think like people, 2016, Behavioral and Brain Sciences.

[57] Bernhard Schölkopf, et al. The Incomplete Rosetta Stone problem: Identifiability results for Multi-view Nonlinear ICA, 2019, UAI.

[58] Aapo Hyvärinen, et al. Nonlinear independent component analysis: Existence and uniqueness results, 1999, Neural Networks.

[59] Tim Verbelen, et al. Improving Generalization for Abstract Reasoning Tasks Using Disentangled Feature Representations, 2018, NeurIPS.

[60] Murray Shanahan, et al. SCAN: Learning Hierarchical Compositional Visual Concepts, 2017, ICLR.

[61] Pierre Comon, et al. Independent component analysis, A new concept?, 1994, Signal Processing.

[62] Jürgen Schmidhuber, et al. Learning Factorial Codes by Predictability Minimization, 1992, Neural Computation.

[63] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.

[64] Pierre-Yves Oudeyer, et al. Curiosity Driven Exploration of Learned Disentangled Goal Spaces, 2018, CoRL.

[65] Yu Zhang, et al. Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data, 2017, NIPS.

[66] Yisong Yue, et al. Factorized Variational Autoencoders for Modeling Audience Reactions to Movies, 2017, CVPR.

[67] Bernhard Schölkopf, et al. Learning Disentangled Representations with Wasserstein Auto-Encoders, 2018, ICLR.

[68] Y. LeCun, et al. Learning methods for generic object recognition with invariance to pose and lighting, 2004, CVPR.