Monte Carlo Variational Auto-Encoders

Variational auto-encoders (VAEs) are popular deep latent variable models that are trained by maximizing an Evidence Lower Bound (ELBO). To obtain tighter ELBOs, and hence better variational approximations, it has been proposed to use importance sampling to obtain a lower-variance estimate of the evidence. However, importance sampling is known to perform poorly in high dimensions. While it has been suggested many times in the literature to use more sophisticated algorithms such as Annealed Importance Sampling (AIS) and its Sequential Importance Sampling (SIS) extensions, the potential benefits brought by these advanced techniques have never been realized for VAEs: the AIS estimate cannot be easily differentiated, while SIS requires the specification of carefully chosen backward Markov kernels. In this paper, we address both issues and demonstrate the performance of the resulting Monte Carlo VAEs on a variety of applications.
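
As background (standard notation, not taken from this paper), the ELBO and the K-sample importance-weighted bound the abstract refers to can be written as

\[
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta,\phi;x) \;=\; \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log \frac{p_\theta(x,z)}{q_\phi(z\mid x)}\right],
\qquad
\mathcal{L}_K(\theta,\phi;x) \;=\; \mathbb{E}_{z_{1:K}\sim q_\phi(\cdot\mid x)}\!\left[\log \frac{1}{K}\sum_{k=1}^{K} \frac{p_\theta(x,z_k)}{q_\phi(z_k\mid x)}\right] \;\le\; \log p_\theta(x),
\]

where \(\mathcal{L}_K\) is non-decreasing in \(K\) and recovers the standard ELBO at \(K=1\). A minimal PyTorch sketch of the K-sample estimator follows; the names `encoder`, `decoder`, and `prior` are hypothetical, and it is assumed that `encoder(x)` and `decoder(z)` return `torch.distributions` objects whose `log_prob` broadcasts over the leading sample axis.

```python
import math
import torch

def iwae_bound(x, encoder, decoder, prior, K=10):
    """Monte Carlo estimate of the K-sample importance-weighted ELBO (a sketch)."""
    qz = encoder(x)                        # q_phi(z|x), a torch.distributions object (assumed interface)
    z = qz.rsample((K,))                   # K reparameterized samples: shape (K, batch, latent_dim)
    log_w = (prior.log_prob(z)             # log p(z)
             + decoder(z).log_prob(x)      # log p_theta(x|z), x broadcast over the K axis (assumed)
             - qz.log_prob(z))             # log q_phi(z|x)
    # log( (1/K) * sum_k w_k ), computed stably in log space
    return torch.logsumexp(log_w, dim=0) - math.log(K)
```

As the abstract notes, pushing beyond this plain importance-sampling estimator is delicate: the AIS version of the evidence estimate cannot be easily differentiated, and SIS extensions require carefully chosen backward Markov kernels, which is what this paper addresses.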
