WiSE-VAE: Wide Sample Estimator VAE

Variational Auto-encoders (VAEs) have been very successful as methods for forming compressed latent representations of complex, often high-dimensional, data. In this paper, we derive an alternative variational lower bound to the one commonly used in VAEs, one that aims to minimize aggregate information loss. Using our lower bound as the objective function for an auto-encoder enables us to place a prior on the bulk statistics of the latent space, corresponding to the aggregate posterior over all latent codes, rather than on each individual code posterior as in the original VAE. This alternative form of prior constraint gives the individual posteriors more flexibility to preserve the information needed for good reconstruction quality. We further derive an analytic approximation to our lower bound, leading to our proposed model, WiSE-VAE. Through various examples, we demonstrate that WiSE-VAE achieves excellent reconstruction quality in comparison to other state-of-the-art VAE models, while still retaining the ability to learn a smooth, compact representation.
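
To make the contrast with the standard VAE objective concrete, the sketch below shows one plausible way such an aggregate-posterior constraint could look in PyTorch. It is only an illustration under stated assumptions, not the paper's actual analytic approximation: here the aggregate posterior is crudely approximated by the batch moments of the sampled codes and matched to a standard-normal prior, and all names (`wise_style_loss`, `beta`) are hypothetical.

```python
import torch

# Minimal sketch (an assumption, not the paper's exact objective): a VAE-style
# loss in which the usual per-sample KL(q(z|x) || p(z)) is replaced by a
# penalty on the *aggregate* posterior q(z) = E_x[q(z|x)], approximated here
# by moment-matching the batch of sampled codes against N(0, I).

def wise_style_loss(x, x_recon, mu, logvar, beta=1.0):
    """Reconstruction term plus an aggregate-posterior penalty.

    mu, logvar: encoder outputs, each of shape (batch, latent_dim).
    """
    # Reparameterised samples from each per-datum posterior q(z|x).
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    # Per-example squared reconstruction error (Gaussian likelihood up to a
    # constant), averaged over the batch.
    recon = ((x_recon - x) ** 2).sum(dim=tuple(range(1, x.dim()))).mean()

    # Bulk statistics of the batch of codes: these stand in for the moments
    # of the aggregate posterior.
    z_mean = z.mean(dim=0)
    z_var = z.var(dim=0, unbiased=False)

    # KL between a diagonal Gaussian with the batch moments and N(0, I);
    # a stand-in for the paper's analytic approximation to the bound.
    agg_kl = 0.5 * (z_var + z_mean ** 2 - 1.0
                    - z_var.clamp_min(1e-8).log()).sum()

    return recon + beta * agg_kl
```

Note how, unlike the standard ELBO, no individual posterior is pulled toward the prior; only the batch-level statistics are constrained, which is what leaves each code free to retain reconstruction-relevant information.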
