Rethinking Generative Mode Coverage: A Pointwise Guaranteed Approach

Many generative models have to combat $\textit{missing modes}$. The conventional wisdom is to reduce, through training, a statistical distance (such as an $f$-divergence) between the generated distribution and the provided data distribution. But this is more of a heuristic than a guarantee: the statistical distance measures a $\textit{global}$, not $\textit{local}$, similarity between two distributions, and even a small distance does not imply plausible mode coverage. Rethinking this problem from a game-theoretic perspective, we show that complete mode coverage is firmly attainable. If a generative model can approximate a data distribution moderately well under a global statistical distance measure, then we can find a mixture of generators that collectively covers $\textit{every}$ data point, and thus $\textit{every}$ mode, with a lower-bounded generation probability. Constructing the generator mixture has a connection to the multiplicative weights update rule, upon which we build our algorithm. We prove that our algorithm guarantees complete mode coverage, and our experiments on real and synthetic datasets confirm better mode coverage than recent approaches that also use generator mixtures but rely on global statistical distances.
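To make the multiplicative-weights connection concrete, here is a minimal Python sketch. It is not the paper's algorithm: a Gaussian KDE stands in for a trained generator, and the learning rate `eta`, the coverage threshold `delta`, and the helper `sample_mixture` are assumed, illustrative choices. The sketch only mirrors the general idea that each round re-weights the data points the current mixture still generates with low probability, so the next generator focuses on them, and the final model samples uniformly from the trained generators.

```python
# Illustrative multiplicative-weights-style construction of a generator mixture.
# NOT the paper's exact algorithm: a Gaussian KDE stands in for a trained GAN,
# and eta / delta / sample_mixture are assumed, illustrative choices.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Toy multi-modal data: three well-separated 1-D modes.
data = np.concatenate([rng.normal(m, 0.1, 200) for m in (-4.0, 0.0, 4.0)])

T = 5          # number of generators in the mixture
eta = 0.5      # multiplicative-weights learning rate (assumed value)
delta = 0.02   # density below which a point counts as "uncovered" (assumed value)

weights = np.ones(len(data)) / len(data)
generators = []

for t in range(T):
    # Fit this round's generator to the re-weighted data (stand-in for GAN training).
    g = gaussian_kde(data, weights=weights)
    generators.append(g)

    # Density of the current uniform mixture at every data point.
    mix_density = np.mean([gi(data) for gi in generators], axis=0)

    # Multiplicative update: boost the weight of points the mixture still misses,
    # so the next generator concentrates on the uncovered modes.
    uncovered = (mix_density < delta).astype(float)
    weights *= np.exp(eta * uncovered)
    weights /= weights.sum()

def sample_mixture(n):
    """Sample by picking a generator uniformly at random, then sampling from it."""
    picks = rng.integers(len(generators), size=n)
    return np.concatenate([generators[i].resample(1, seed=rng)[0] for i in picks])

print(sample_mixture(10))
```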
