Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling

We show that the sum of the implicit generator log-density $\log p_g$ of a GAN and the logit score of the discriminator defines an energy function that recovers the true data density when the generator is imperfect but the discriminator is optimal, thus making it possible to improve on the typical generator (with implicit density $p_g$). To make this practical, we show that sampling from the modified density can be achieved by sampling in latent space according to an energy-based model induced by the sum of the latent prior log-density and the discriminator output score. Concretely, we run Langevin MCMC in latent space and then apply the generator function, a procedure we call Discriminator Driven Latent Sampling~(DDLS). We show that DDLS is highly efficient compared to previous methods that operate in high-dimensional pixel space, and that it can be applied to improve previously trained GANs of many types. We evaluate DDLS on both synthetic and real-world datasets, qualitatively and quantitatively. On CIFAR-10, DDLS substantially improves the Inception Score of an off-the-shelf pre-trained SN-GAN~\citep{sngan} from $8.22$ to $9.09$, comparable even to the class-conditional BigGAN~\citep{biggan} model, and sets a new state of the art for unconditional image synthesis without introducing extra parameters or additional training.
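To make the sampling procedure concrete, below is a minimal PyTorch sketch of DDLS under stated assumptions: `g` is a pre-trained generator mapping latents to samples, `d` is the discriminator's logit (pre-sigmoid) output, the latent prior is standard normal, and the step size, step count, and batch size are hypothetical placeholders rather than tuned settings from the paper. The latent energy is the negative prior log-density (up to a constant) minus the discriminator logit, and sampling uses unadjusted Langevin dynamics.

```python
import torch


def ddls_sample(g, d, z_dim, n_samples=64, n_steps=100, step_size=0.01):
    """Sketch of Discriminator Driven Latent Sampling (DDLS).

    Runs unadjusted Langevin dynamics on the latent energy
        E(z) = 0.5 * ||z||^2 - d(g(z)),
    i.e. the negative standard-normal prior log-density (up to an
    additive constant) minus the discriminator logit, then decodes
    the final latents with the generator.
    """
    z = torch.randn(n_samples, z_dim, requires_grad=True)
    for _ in range(n_steps):
        energy = 0.5 * (z ** 2).sum(dim=1) - d(g(z)).view(-1)
        # Gradient of the total energy with respect to the latents only;
        # generator and discriminator weights are left untouched.
        grad = torch.autograd.grad(energy.sum(), z)[0]
        with torch.no_grad():
            # Langevin update: descend the energy and inject Gaussian noise.
            z = z - 0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(z)
        z.requires_grad_(True)
    with torch.no_grad():
        return g(z)
```

In a typical use, `ddls_sample(generator, discriminator, z_dim=128)` would replace direct sampling via `generator(torch.randn(n, 128))`; in practice the Langevin step size and number of steps need tuning, and a Metropolis-Hastings correction can be added if exact samples from the induced energy-based model are required.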

[1] Aapo Hyvärinen, et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models, 2010, AISTATS.

[2] Takumi Sugiyama, et al. A study report on "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", 2017.

[3] Yee Whye Teh, et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics, 2011, ICML.

[4] David J. C. MacKay, et al. Information Theory, Inference, and Learning Algorithms, 2004, IEEE Transactions on Information Theory.

[5] Carlos Guestrin, et al. Adversarial Fisher Vectors for Unsupervised Representation Learning, 2019, NeurIPS.

[6] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.

[7] Jascha Sohl-Dickstein, et al. A new method for parameter estimation in probabilistic models: Minimum probability flow, 2011, Physical Review Letters.

[8] Fu Jie Huang, et al. A Tutorial on Energy-Based Learning, 2006.

[9] Yang Song, et al. Generative Modeling by Estimating Gradients of the Data Distribution, 2019, NeurIPS.

[10] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.

[11] Karen Simonyan, et al. Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders, 2017, ICML.

[12] Oriol Vinyals, et al. Learning Implicit Generative Models with the Method of Learned Moments, 2018, ICML.

[13] Anna Dai. Generative Modeling, 2020.

[14] Tian Han, et al. Divergence Triangle for Joint Training of Generator Model, Energy-Based Model, and Inferential Model, 2018, CVPR.

[15] Tijmen Tieleman, et al. Training restricted Boltzmann machines using approximations to the likelihood gradient, 2008, ICML.

[16] Yann LeCun, et al. Energy-based Generative Adversarial Network, 2016, ICLR.

[17] Fan Yang, et al. Good Semi-supervised Learning That Requires a Bad GAN, 2017, NIPS.

[18] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.

[19] Gilles Louppe, et al. Approximating Likelihood Ratios with Calibrated Discriminative Classifiers, 2015, arXiv:1506.02169.

[20] Igor Mordatch, et al. Implicit Generation and Modeling with Energy Based Models, 2019, NeurIPS.

[21] Jaakko Lehtinen, et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017, ICLR.

[22] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.

[23] Tian Han, et al. Alternating Back-Propagation for Generator Network, 2016, AAAI.

[24] Yoshua Bengio, et al. Deep Directed Generative Models with Energy-Based Probability Estimation, 2016, arXiv.

[25] Jason Yosinski, et al. Metropolis-Hastings Generative Adversarial Networks, 2018, ICML.

[26] Geoffrey E. Hinton, et al. Deep Boltzmann Machines, 2009, AISTATS.

[27] Yan Wu, et al. LOGAN: Latent Optimisation for Generative Adversarial Networks, 2019, arXiv.

[28] G. Casella, et al. Generalized Accept-Reject sampling schemes, 2004.

[29] George Tucker, et al. Energy-Inspired Models: Learning with Sampler-Induced Distributions, 2019, NeurIPS.

[30] Eric Horvitz, et al. Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting, 2019, DGS@ICLR.

[31] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.

[32] Jaakko Lehtinen, et al. Analyzing and Improving the Image Quality of StyleGAN, 2020, CVPR.

[33] Tian Han, et al. Joint Training of Variational Auto-Encoder and Latent Energy-Based Model, 2020, CVPR.

[34] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.

[35] J. B. King. Enlightenment Now: The Case for Reason, Science, Humanism, and Progress, 2020, Theology and Science.

[36] Sergey Levine, et al. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models, 2016, arXiv.

[37] Myle Ott, et al. Residual Energy-Based Models for Text Generation, 2020, ICLR.

[38] Aapo Hyvärinen, et al. Estimation of Non-Normalized Statistical Models by Score Matching, 2005, Journal of Machine Learning Research.

[39] Tian Han, et al. Learning Latent Space Energy-Based Prior Model, 2020, NeurIPS.

[40] David Pfau, et al. Unrolled Generative Adversarial Networks, 2016, ICLR.

[41] Yuichi Yoshida, et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.

[42] Akinori Tanaka, et al. Discriminator optimal transport, 2019, NeurIPS.

[43] Alex Graves, et al. Conditional Image Generation with PixelCNN Decoders, 2016, NIPS.

[44] Mario Lucic, et al. Are GANs Created Equal? A Large-Scale Study, 2017, NeurIPS.

[45] Sebastian Nowozin, et al. Stabilizing Training of Generative Adversarial Networks through Regularization, 2017, NIPS.

[46] Jianfeng Feng, et al. On Fenchel Mini-Max Learning, 2019, NeurIPS.

[47] Ping Tan, et al. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation, 2017, ICCV.

[48] J. Hobson. Enlightenment Now: The Case for Reason, Science, Humanism, and Progress, 2019, Occupational Medicine.

[49] Mohammad Norouzi, et al. Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One, 2019, ICLR.

[50] Yoshua Bengio, et al. Better Mixing via Deep Representations, 2012, ICML.

[51] Andrew M. Dai, et al. Flow Contrastive Estimation of Energy-Based Models, 2020, CVPR.

[52] James Zou, et al. AI can be sexist and racist — it's time to make it fair, 2018, Nature.

[53] Yang Lu, et al. Learning Generative ConvNets via Multi-grid Modeling and Sampling, 2017, CVPR.

[54] Wojciech Zaremba, et al. Improved Techniques for Training GANs, 2016, NIPS.

[55] Trevor Darrell, et al. Discriminator Rejection Sampling, 2018, ICLR.

[56] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[57] Yoshua Bengio, et al. Mode Regularized Generative Adversarial Networks, 2016, ICLR.

[58] Le Song, et al. Exponential Family Estimation via Adversarial Dynamics Embedding, 2019, NeurIPS.

[59] Joshua V. Dillon, et al. NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport, 2019, arXiv:1903.03704.

[60] Alexei A. Efros, et al. Generative Visual Manipulation on the Natural Image Manifold, 2016, ECCV.

[61] Yoshua Bengio, et al. Maximum Entropy Generators for Energy-Based Models, 2019, arXiv.

[62] Stefano Ermon, et al. Variational Rejection Sampling, 2018, AISTATS.

[63] Andriy Mnih, et al. Resampled Priors for Variational Autoencoders, 2018, AISTATS.

[64] Arthur Gretton, et al. KALE: When Energy-Based Learning Meets Adversarial Training, 2020, arXiv.

[65] Rémi Munos, et al. Autoregressive Quantile Networks for Generative Modeling, 2018, ICML.

[66] Yoshua Bengio, et al. Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space, 2016, CVPR.

[67] Yang Lu, et al. Cooperative Learning of Energy-Based Model and Latent Variable Model via MCMC Teaching, 2018, AAAI.

[68] Sepp Hochreiter, et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, 2017, NIPS.

[69] Erik Nijkamp, et al. Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model, 2019, NeurIPS.