Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning

A key goal of unsupervised representation learning is "inverting" a data generating process to recover its latent properties. Existing work that provably achieves this goal relies on strong assumptions about the relationships among the latent variables (e.g., independence conditional on auxiliary information). In this paper, we take a very different perspective on the problem and ask, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?" We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms. In particular, we prove that if we know the exact mechanisms under which the latent properties evolve, then identification can be achieved up to any equivariances that are shared by the underlying mechanisms. We generalize this characterization to settings where we only know some hypothesis class over possible mechanisms, as well as to settings where the mechanisms are stochastic. We demonstrate the power of this mechanism-based perspective by showing that it can be used to generalize existing identifiability results for representation learning. These results suggest that by exploiting inductive biases on mechanisms, it is possible to design a range of new identifiable representation learning approaches.
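The equivariance claim above can be illustrated with a minimal numerical sketch. The setup below is a hypothetical toy example, not the paper's construction: the latent mechanism m(z) = 2z, the mixing matrix G, and the map A are all illustrative choices. Because every invertible linear map commutes with scalar scaling, an encoder composed with such a map remains consistent with the mechanism, so observations alone cannot distinguish the two encoders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent mechanism (assumed known): m(z) = 2z.
m = lambda z: 2.0 * z

# Unknown invertible mixing g(z) = G z producing observations x.
G = rng.standard_normal((2, 2)) + 3.0 * np.eye(2)
g = lambda z: G @ z
f_true = lambda x: np.linalg.solve(G, x)   # the "true" inverse encoder

# Any invertible A commutes with scalar scaling: A(2z) = 2(Az),
# i.e., A is an equivariance of the mechanism m.
A = rng.standard_normal((2, 2)) + 3.0 * np.eye(2)
f_alt = lambda x: A @ f_true(x)            # alternative encoder

z0 = rng.standard_normal(2)
x0, x1 = g(z0), g(m(z0))                   # one observed transition

# Both encoders satisfy the mechanism constraint f(x1) = m(f(x0)),
# so data generated under m cannot tell them apart: identification
# holds only up to transformations equivariant under the mechanism.
assert np.allclose(f_true(x1), m(f_true(x0)))
assert np.allclose(f_alt(x1), m(f_alt(x0)))
```

By contrast, a mechanism with a smaller set of equivariances (e.g., one that commutes only with permutations) would pin the encoder down more tightly, which is the intuition behind the identifiability results stated in the abstract.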
