Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models

In this work, we propose a new family of generative flows on an augmented data space, aiming to improve expressivity without drastically increasing the computational cost of sampling or of evaluating a lower bound on the likelihood. Theoretically, we prove that the proposed flow can approximate a Hamiltonian ODE as a universal transport map. Empirically, we demonstrate state-of-the-art performance on standard benchmarks of flow-based generative modeling.
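To make the construction concrete, below is a minimal sketch of one augmented coupling step and the resulting stochastic lower bound on log p(x): the data x is paired with auxiliary noise e drawn from q(e | x), an invertible map acts on the joint pair, and the change-of-variables term plus the entropy of q yields the bound. All names here (the toy scale/shift networks `s`, `t`, the single `augmented_step`) are illustrative placeholders under simplifying assumptions, not the paper's actual architecture.

```python
import numpy as np

def s(h):
    # Toy scale "network"; stands in for a learned neural net.
    return np.tanh(h)

def t(h):
    # Toy shift "network".
    return 0.5 * h

def augmented_step(x, e):
    """One affine coupling step on the augmented pair (x, e).

    Returns the transformed pair and the log-determinant of the
    Jacobian of the joint transformation.
    """
    e_new = e * np.exp(s(x)) + t(x)           # update e conditioned on x
    x_new = x * np.exp(s(e_new)) + t(e_new)   # update x conditioned on new e
    logdet = np.sum(s(x)) + np.sum(s(e_new))  # log|det J| of both updates
    return x_new, e_new, logdet

def log_px_lower_bound(x, rng):
    """Single-sample estimate of the lower bound
       E_q[ log p(x', e') + log|det J| - log q(e | x) ],
    with a standard-normal base density and q(e | x) = N(0, I)."""
    e = rng.standard_normal(x.shape)
    log_q = -0.5 * np.sum(e**2 + np.log(2 * np.pi))
    x2, e2, logdet = augmented_step(x, e)
    z = np.concatenate([x2, e2])
    log_pz = -0.5 * np.sum(z**2 + np.log(2 * np.pi))
    return log_pz + logdet - log_q

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
print(log_px_lower_bound(x, rng))
```

In practice the single step would be replaced by a deep stack of such couplings with learned networks, and the bound would be tightened by averaging over multiple samples of e; the sketch only illustrates why the augmented likelihood is tractable while remaining a lower bound on the marginal log p(x).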
