Rethinking Generative Coverage: A Pointwise Guaranteed Approach

All generative models must combat missing modes. The conventional wisdom is to reduce a statistical distance (such as an f-divergence) between the generated distribution and the provided data distribution through training. We defy this wisdom. We show that even a small statistical distance does not imply plausible mode coverage, because such a distance measures a global similarity between two distributions, not their similarity in local regions, which is what complete mode coverage requires. From a starkly different perspective, we view the battle against missing modes as a two-player game between a player choosing a data point and an adversary choosing a generator that aims to cover that data point. Guided by von Neumann's minimax theorem, we see that if a generative model can approximate a data distribution moderately well under a global statistical distance measure, then we should be able to find a mixture of generators that collectively covers every data point, and thus every mode, with a lower-bounded probability density. A constructive realization of this minimax duality, namely our proposed algorithm for finding the mixture of generators, is connected to a multiplicative weights update rule. We prove the pointwise coverage guarantee of our algorithm, and our experiments on real and synthetic data confirm better mode coverage than recent approaches that also use a mixture of generators but focus on global statistical distances.
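The minimax reasoning in the abstract can be sketched concretely as follows. The notation here is ours, not necessarily the paper's: $\mathcal{X}$ is the support of the data, $\mathcal{G}$ a family of generators, $p_G$ the density of generator $G$, $\Delta(\cdot)$ the set of distributions over a set, and $\delta$ a coverage level; the compactness and convexity conditions needed for the theorem are assumed to hold.

```latex
% Two-player game: a "point player" picks a data point x from X,
% a "generator player" picks a mixture mu over the generator family G.
% The payoff is the mixture's density at x. Von Neumann's minimax
% theorem equates the two orders of play:
\[
  \max_{\mu \in \Delta(\mathcal{G})} \; \min_{x \in \mathcal{X}}
    \; \mathbb{E}_{G \sim \mu}\!\left[ p_G(x) \right]
  \;=\;
  \min_{D \in \Delta(\mathcal{X})} \; \max_{G \in \mathcal{G}}
    \; \mathbb{E}_{x \sim D}\!\left[ p_G(x) \right].
\]
% If every reweighted data distribution D admits some generator that
% approximates it moderately well under a global statistical distance,
% the right-hand side is bounded below by some delta > 0; the equality
% then yields a mixture mu whose density is at least delta at every
% data point, i.e., a pointwise coverage guarantee.
```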
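To make the multiplicative-weights connection tangible, here is a minimal sketch of a loop of that flavor: maintain weights over data points, train each generator on the reweighted data, and multiplicatively down-weight points that are already covered. The subroutines `train_generator` and `density_estimate`, the parameters `rounds`, `beta`, and `delta`, and the uniform final mixture are our illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def mwu_mixture(data, train_generator, density_estimate,
                rounds=10, beta=0.8, delta=1e-3):
    """Hypothetical multiplicative-weights loop for building a mixture
    of generators with pointwise coverage, in the spirit of the abstract.

    train_generator(data, weights) -> generator fit to the reweighted data
    density_estimate(generator, x) -> estimated density the generator puts on x
    Both callables are placeholders supplied by the user.
    """
    n = len(data)
    weights = np.ones(n) / n          # start uniform over data points
    generators = []

    for _ in range(rounds):
        # Train one generator on the current reweighted data distribution.
        g = train_generator(data, weights / weights.sum())
        generators.append(g)

        # Multiplicative update: down-weight points this generator already
        # covers (density >= delta), so later rounds focus on missed modes.
        covered = np.array([density_estimate(g, x) >= delta for x in data])
        weights[covered] *= beta

    # The final model is the uniform mixture of the trained generators.
    return generators

def sample_mixture(generators, sample_one):
    """Draw one sample from the uniform mixture; sample_one(g) is a
    placeholder that draws a single sample from generator g."""
    g = generators[np.random.randint(len(generators))]
    return sample_one(g)
```

The down-weighting of covered points mirrors boosting-style reweighting: each new generator is steered toward the regions its predecessors left under-covered, which is what drives the lower bound on the mixture's density at every point.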
