On Disentangled Representations Learned from Correlated Data

The focus of disentanglement approaches has been on identifying independent factors of variation in data. However, the causal variables underlying real-world observations are often not statistically independent. In this work, we bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data in a large-scale empirical study (including 4260 models). We show and quantify that systematically induced correlations in the dataset are learned and reflected in the latent representations, which has implications for downstream applications of disentanglement such as fairness. We also demonstrate how to resolve these latent correlations, either by using weak supervision during training or by correcting a pre-trained model post hoc with a small number of labels.
