A Contrastive Divergence for Combining Variational Inference and MCMC

We develop a method to combine Markov chain Monte Carlo (MCMC) and variational inference (VI), leveraging the advantages of both inference approaches. Specifically, we improve the variational distribution by running a few MCMC steps. To make inference tractable, we introduce the variational contrastive divergence (VCD), a new divergence that replaces the standard Kullback-Leibler (KL) divergence used in VI. The VCD captures a notion of discrepancy between the initial variational distribution and its improved version (obtained after running the MCMC steps), and it converges asymptotically to the symmetrized KL divergence between the variational distribution and the posterior of interest. The VCD objective can be optimized efficiently with respect to the variational parameters via stochastic optimization. We show experimentally that optimizing the VCD leads to better predictive performance on two latent variable models: logistic matrix factorization and variational autoencoders (VAEs).
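As a minimal sketch (the precise definition is not reproduced in the abstract and is assumed here), let $q_\theta(z)$ denote the variational distribution and $q^{(t)}_\theta(z)$ its improved version, obtained by passing samples from $q_\theta$ through $t$ steps of an MCMC kernel whose stationary distribution is the posterior $p(z \mid x)$. A divergence consistent with the properties stated above can be written as

$$\mathcal{L}_{\mathrm{VCD}}(\theta) \;=\; \mathrm{KL}\big(q_\theta(z)\,\|\,p(z \mid x)\big) \;-\; \mathrm{KL}\big(q^{(t)}_\theta(z)\,\|\,p(z \mid x)\big) \;+\; \mathrm{KL}\big(q^{(t)}_\theta(z)\,\|\,q_\theta(z)\big).$$

Under this form, the quantity is non-negative, vanishes when the MCMC steps leave $q_\theta$ unchanged, and, because $q^{(t)}_\theta(z) \to p(z \mid x)$ as $t \to \infty$, it converges to the symmetrized KL divergence $\mathrm{KL}(q_\theta \,\|\, p) + \mathrm{KL}(p \,\|\, q_\theta)$. The intractable $\log p(x)$ terms cancel between the first two KL terms, which is what makes stochastic estimation and optimization of the objective with respect to the variational parameters $\theta$ feasible.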
