Improved Variational Inference with Inverse Autoregressive Flow

The framework of normalizing flows provides a general strategy for flexible variational inference of posteriors over latent variables. We propose a new type of normalizing flow, inverse autoregressive flow (IAF), that, in contrast to earlier published flows, scales well to high-dimensional latent spaces. The proposed flow consists of a chain of invertible transformations, where each transformation is based on an autoregressive neural network. In experiments, we show that IAF significantly improves upon diagonal Gaussian approximate posteriors. In addition, we demonstrate that a novel type of variational autoencoder, coupled with IAF, is competitive with neural autoregressive models in terms of attained log-likelihood on natural images, while allowing significantly faster synthesis.
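
As an illustration of the chain of invertible transformations described above, here is a minimal NumPy sketch of one possible IAF parameterization. It makes several assumptions that the abstract does not state. The autoregressive network is reduced to a single masked linear layer (`make_autoregressive_layer` is a hypothetical helper). Each step uses a gated update of the form `z_new = sigma * z + (1 - sigma) * m`. The variable ordering is reversed between steps. Conditioning on the data, which a real encoder would provide, is left out. Because `m` and `sigma` for dimension `i` depend only on dimensions `j < i`, the Jacobian of each step is triangular and its log-determinant is the sum of `log(sigma)`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_autoregressive_layer(dim, rng):
    """Hypothetical stand-in for an autoregressive network: one masked
    linear layer in which output i depends only on inputs j < i."""
    mask = np.tril(np.ones((dim, dim)), k=-1)  # strictly lower triangular
    w_m = rng.normal(scale=0.1, size=(dim, dim)) * mask
    w_s = rng.normal(scale=0.1, size=(dim, dim)) * mask
    b_m = np.zeros(dim)
    b_s = np.full(dim, 1.5)  # assumption: bias the gate toward 1 at init
    return w_m, b_m, w_s, b_s

def iaf_step(z, params):
    """One invertible step; returns the new z and log|det Jacobian|."""
    w_m, b_m, w_s, b_s = params
    m = z @ w_m.T + b_m            # shift, autoregressive in z
    sigma = sigmoid(z @ w_s.T + b_s)  # gate in (0, 1), autoregressive in z
    z_new = sigma * z + (1.0 - sigma) * m
    log_det = np.sum(np.log(sigma), axis=-1)  # triangular Jacobian, diagonal = sigma
    return z_new, log_det

def iaf_chain(eps, layers):
    """Push standard-normal noise eps through a chain of steps and
    track the log-density of the result under the flow."""
    z = eps
    log_q = -0.5 * np.sum(eps ** 2 + np.log(2.0 * np.pi), axis=-1)
    for params in layers:
        z, log_det = iaf_step(z, params)
        log_q = log_q - log_det
        z = z[..., ::-1]  # reverse the ordering; a permutation has |det| = 1
    return z, log_q

rng = np.random.default_rng(0)
layers = [make_autoregressive_layer(8, rng) for _ in range(3)]
z_T, log_q = iaf_chain(rng.normal(size=(16, 8)), layers)
```

Every dimension of `z_new` is computed from the current `z` in a single matrix product, so one step is a parallel operation over the latent dimensions. This is one way to read the abstract's claim that the flow scales well to high-dimensional latent spaces.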
