Coverage and Quality Driven Training of Generative Image Models

Generative modeling of natural images has been extensively studied in recent years, yielding remarkable progress. Current state-of-the-art methods are based either on maximum likelihood estimation or on adversarial training, and the two approaches have complementary drawbacks. The first leads to over-generalization: the maximum likelihood criterion encourages models to cover the support of the training data by heavily penalizing small probability mass assigned to training samples, while the simplifying assumptions in such models limit their capacity and make them spill mass on unrealistic samples. The second leads to mode dropping: adversarial training encourages high-quality samples from the model, but only indirectly enforces diversity among them. To overcome these drawbacks we make two contributions. First, we propose a model that extends variational autoencoders with deterministic invertible transformation layers that map samples from the decoder to the image space. This induces correlations among the pixels given the latent variables, improving over commonly used factorial decoders. Second, we propose a unified training approach that leverages both coverage-based and quality-based criteria. Our models obtain likelihood scores competitive with state-of-the-art likelihood-based models, while achieving sample quality typical of adversarially trained networks.

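The two contributions above lend themselves to a compact illustration. The snippet below is a minimal sketch, assuming PyTorch: a factorial decoder output is refined by invertible Real NVP-style coupling layers so that pixels become correlated given the latent code, and the training objective combines a likelihood-based coverage term with an adversarial quality term. The names (AffineCoupling, Decoder, combined_loss, lambda_adv) and all layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
# Illustrative sketch (not the paper's exact model): a VAE-style decoder whose
# output passes through invertible affine-coupling layers before the image space,
# trained with a combined coverage (likelihood) and quality (adversarial) loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AffineCoupling(nn.Module):
    """Invertible coupling layer: transforms half of the channels conditioned on
    the other half; the log-determinant is the sum of the log-scales."""

    def __init__(self, channels, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels // 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1))

    def forward(self, x):
        xa, xb = x.chunk(2, dim=1)
        log_s, t = self.net(xa).chunk(2, dim=1)
        log_s = torch.tanh(log_s)              # bounded scales for stability
        yb = xb * log_s.exp() + t
        logdet = log_s.flatten(1).sum(dim=1)   # contribution to log p(x | z)
        return torch.cat([xa, yb], dim=1), logdet


class Decoder(nn.Module):
    """Factorial decoder followed by invertible layers, inducing pixel
    correlations given the latent code z."""

    def __init__(self, z_dim=64, channels=4, image_hw=32, n_flows=4):
        super().__init__()
        self.base = nn.Linear(z_dim, channels * image_hw * image_hw)
        self.shape = (channels, image_hw, image_hw)
        self.flows = nn.ModuleList(AffineCoupling(channels) for _ in range(n_flows))

    def forward(self, z):
        x = self.base(z).view(-1, *self.shape)
        total_logdet = 0.0
        for flow in self.flows:
            x, logdet = flow(x)
            total_logdet = total_logdet + logdet
        return x, total_logdet


def combined_loss(recon_ll, kl, d_fake_logits, lambda_adv=1.0):
    """Unified objective (illustrative): coverage term = negative ELBO,
    quality term = non-saturating GAN loss against a discriminator."""
    coverage = -(recon_ll - kl).mean()
    quality = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    return coverage + lambda_adv * quality
```

Because the coupling layers are invertible with a tractable log-determinant, the likelihood term can still be evaluated exactly for decoder samples after the transformation, which is what allows the coverage and quality criteria to be combined in a single training objective.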