Robust Disentanglement of a Few Factors at a Time

Disentanglement is at the forefront of unsupervised learning, as disentangled representations of data improve generalization, interpretability, and performance in downstream tasks. Current unsupervised approaches remain inapplicable to real-world datasets: their performance is highly variable, and they fail to reach the levels of disentanglement achieved by (semi-)supervised approaches. We introduce population-based training (PBT) to improve consistency in training variational autoencoders (VAEs) and demonstrate the validity of this approach in a supervised setting (PBT-VAE). We then use Unsupervised Disentanglement Ranking (UDR) as an unsupervised heuristic to score models during PBT-VAE training and show that models trained this way tend to consistently disentangle only a subset of the generative factors. Building on this observation, we introduce the recursive rPU-VAE approach: we train a model until convergence, remove the learned factors from the dataset, and iterate. In doing so, we can label subsets of the dataset with the learned factors and subsequently use these labels to train a single model that fully disentangles the whole dataset. With this approach, we show a striking improvement in state-of-the-art unsupervised disentanglement performance and robustness across multiple datasets and metrics.
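To make the recursive procedure concrete, the following Python sketch illustrates the rPU-VAE loop as described in the abstract. All helper functions (train_pbt_vae, udr_score, identify_disentangled_factors, infer_labels, condition_out_factors, train_supervised_vae) are hypothetical placeholders standing in for the paper's components; this is a minimal illustration of the control flow, not a reference implementation.

```python
def rpu_vae(dataset, max_rounds=10, udr_threshold=0.5):
    """Sketch of the recursive rPU-VAE loop: peel off a few
    disentangled factors per round until none remain, then train
    one supervised model on the accumulated labels.

    All helpers called below are hypothetical placeholders.
    """
    labels = {}  # factor name -> per-sample labels recovered so far
    for _ in range(max_rounds):
        # 1. Train a population of VAEs with PBT, using UDR as the
        #    unsupervised heuristic to score population members.
        model = train_pbt_vae(dataset, score_fn=udr_score)

        # 2. Keep only the latent dimensions that the UDR heuristic
        #    marks as reliably disentangled in this round.
        factors = identify_disentangled_factors(model, dataset, udr_threshold)
        if not factors:
            break  # no further factors recovered; stop recursing

        # 3. Label the data with the learned factors, then remove
        #    those factors from the dataset so the next round focuses
        #    on the remaining generative factors.
        labels.update(infer_labels(model, dataset, factors))
        dataset = condition_out_factors(dataset, factors)

    # 4. Use the accumulated labels to train a single model that
    #    disentangles all factors of the original dataset.
    return train_supervised_vae(dataset, labels)
```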
