Quasi-symplectic Langevin Variational Autoencoder

The variational autoencoder (VAE) is one of the most thoroughly investigated generative models in current neural learning research. Applying VAEs to practical tasks with high-dimensional data and large datasets often runs into the difficulty of constructing a low-variance evidence lower bound (ELBO). Markov chain Monte Carlo (MCMC) is an effective approach to tightening the ELBO when approximating the posterior distribution. The Hamiltonian Variational Autoencoder (HVAE) is one such MCMC-inspired approach: it constructs an unbiased, low-variance ELBO that is also amenable to the reparameterization trick, and it significantly improves the quality of posterior estimation. However, a main drawback of HVAE is that its leapfrog integrator must evaluate the posterior gradient twice per step, which degrades inference efficiency and incurs a large GPU memory footprint. This flaw limits the applicability of Hamiltonian-based inference frameworks to large-scale networks. To tackle this problem, we propose a Quasi-symplectic Langevin Variational Autoencoder (Langevin-VAE), which substantially improves resource efficiency. We demonstrate, both qualitatively and quantitatively, the effectiveness of the Langevin-VAE compared to state-of-the-art gradient-informed inference frameworks.
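To illustrate the efficiency argument above, the following is a minimal NumPy sketch, not the paper's implementation: it contrasts one leapfrog step (as used in HVAE, requiring two gradient evaluations) with a damped quasi-symplectic Langevin step that needs only one. The toy Gaussian potential `grad_U`, the step size `eps`, and the `damping` parameter are illustrative assumptions.

```python
import numpy as np

def grad_U(z):
    # Toy potential: U(z) = 0.5 * ||z||^2 (standard Gaussian negative log-density),
    # so the gradient is simply z. A real VAE would use the negative log posterior.
    return z

def leapfrog_step(z, p, eps):
    """One leapfrog (velocity Verlet) step as in HVAE-style flows.

    Note the TWO gradient evaluations per step, which is the cost
    the Langevin scheme avoids.
    """
    p = p - 0.5 * eps * grad_U(z)   # first gradient evaluation (half kick)
    z = z + eps * p                 # drift
    p = p - 0.5 * eps * grad_U(z)   # second gradient evaluation (half kick)
    return z, p

def quasi_symplectic_langevin_step(z, p, eps, damping):
    """One damped (quasi-symplectic) Langevin-type step.

    A SINGLE gradient evaluation per transition: the momentum is first
    contracted by the friction factor exp(-damping * eps), then kicked once.
    """
    p = np.exp(-damping * eps) * p - eps * grad_U(z)  # one gradient evaluation
    z = z + eps * p                                   # drift
    return z, p
```

For the toy Gaussian potential, the leapfrog step conserves the Hamiltonian `0.5 * (||p||^2 + ||z||^2)` up to O(eps^2) error per step, while the Langevin step trades exact symplecticity for the halved gradient cost.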
