A disentangled representation of a data set should be capable of recovering the underlying factors that generated it. One question that arises is whether a latent variable model with a Euclidean latent space can produce a disentangled representation when the underlying generating factors have a particular geometrical structure. Take, for example, images of a car seen from different angles: the angle is a periodic factor, but a 1-dimensional Euclidean representation fails to capture this topology. How can we address this problem? Our submission to the first stage of the NeurIPS 2019 Disentanglement Challenge consists of a Diffusion Variational Autoencoder ($\Delta$VAE) with a hyperspherical latent space, which can, for example, recover periodic true factors. Training of the $\Delta$VAE is enhanced by a modified version of the Evidence Lower Bound (ELBO) that tailors the encoding capacity of the approximate posterior.
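A minimal sketch (not the challenge code) of the topological point above, under the assumption that the periodic factor is a single angle: a 1-dimensional Euclidean latent places an angle and its full-turn equivalent far apart, whereas an embedding on the unit circle $S^1$, of the kind a hyperspherical latent space provides, maps them to the same point.

```python
import math

def embed_angle_euclidean(theta):
    # Naive 1-D Euclidean embedding: theta and theta + 2*pi map to
    # distant points even though they are the same orientation.
    return theta

def embed_angle_circle(theta):
    # Embedding on the unit circle S^1: the periodic topology is
    # preserved, so theta and theta + 2*pi map to the same point.
    return (math.cos(theta), math.sin(theta))

# Distance between two encodings of the same orientation:
d_euclid = abs(embed_angle_euclidean(2 * math.pi) - embed_angle_euclidean(0.0))
x0, y0 = embed_angle_circle(0.0)
x1, y1 = embed_angle_circle(2 * math.pi)
d_circle = math.hypot(x1 - x0, y1 - y0)

print(d_euclid)  # roughly 6.28: far apart in the 1-D Euclidean latent
print(d_circle)  # essentially 0: identical on the circle
```

The same reasoning extends to higher-dimensional hyperspheres when several factors, or a factor such as 3-D orientation, carry a non-Euclidean topology.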