Likelihood Almost Free Inference Networks

Variational inference for latent variable models is prevalent across machine learning, and is typically carried out by maximizing the evidence lower bound (ELBO) of the true data likelihood with respect to a variational distribution. However, freely enriching the family of variational distributions is challenging, since the ELBO requires evaluating the variational likelihood of the latent variables. In this paper, we propose a novel framework that enriches the variational family through an alternative lower bound, obtained by introducing auxiliary random variables into the variational distribution only. While admitting a much richer family of complex variational distributions, the resulting inference network is likelihood almost free in the sense that only the latent variables require likelihood evaluations, under simple distributions, and samples from all the auxiliary variables suffice for maximum likelihood inference. We show that the proposed approach essentially optimizes a probabilistic mixture of ELBOs, thereby enriching modeling capacity and enhancing robustness. It outperforms state-of-the-art methods in our experiments on several density estimation tasks.
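For background, the two bounds at play can be sketched as follows. This is the standard formulation of the ELBO and of the auxiliary-variable (hierarchical) variational bound in the style of Hierarchical Variational Models and Auxiliary Deep Generative Models, not the paper's own derivation; the symbols $\theta$, $\phi$, and $\psi$ stand for generic model, variational, and auxiliary parameters. The standard ELBO requires evaluating the variational density $q_\phi(z \mid x)$:

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x, z) - \log q_\phi(z \mid x) \right].$$

Augmenting the variational distribution with an auxiliary variable $a$ yields the richer mixture family $q_\phi(z \mid x) = \int q_\phi(z \mid a, x)\, q_\phi(a \mid x)\, \mathrm{d}a$, and the usual auxiliary-variable bound

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(a, z \mid x)}\!\left[ \log p_\theta(x, z) + \log r_\psi(a \mid x, z) - \log q_\phi(z \mid a, x) - \log q_\phi(a \mid x) \right],$$

which still requires evaluating $\log q_\phi(a \mid x)$. The "likelihood almost free" property claimed above amounts to an alternative bound that removes this requirement: only the simple conditional $q_\phi(z \mid a, x)$ is ever evaluated as a density, while the auxiliary variable $a$ need only be sampled.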
