Learning Implicit Generative Models with the Method of Learned Moments

We propose a method of moments (MoM) algorithm for training large-scale implicit generative models. Moment estimation in this setting encounters two problems: it is often difficult to define the millions of moments needed to learn the model parameters, and it is hard to determine which properties are useful when specifying moments. To address the first issue, we introduce a moment network and define the moments as the network's hidden units and the gradient of the network's output with respect to its parameters. To tackle the second problem, we use asymptotic theory to highlight desiderata for moments -- namely, they should minimize the asymptotic variance of the estimated model parameters -- and introduce an objective to learn better moments. The sequence of objectives created by this Method of Learned Moments (MoLM) can train high-quality neural image samplers. On CIFAR-10, we demonstrate that MoLM-trained generators achieve significantly higher Inception Scores and lower Fréchet Inception Distances than those trained with gradient penalty-regularized and spectrally-normalized adversarial objectives. These generators also achieve nearly perfect Multi-Scale Structural Similarity Scores on CelebA, and can create high-quality samples of 128×128 images.
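
To make the moment definition concrete, here is a minimal, illustrative PyTorch sketch of the generator-side matching step, not the authors' implementation. It uses only the gradient-of-output moments (the hidden-unit moments and the variance-minimizing objective for training the moment network itself are omitted), and it matches data and model moments in plain squared error, which is one simple choice of weighting. All names (moment_net, mean_moments, generator_step) and the architecture are hypothetical.

```python
import torch
import torch.nn as nn

# Moment network f(x; psi) with a scalar output; the architecture is a
# placeholder for whatever network the practitioner chooses.
moment_net = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 1),
)

def mean_moments(x):
    """Batch-mean moment vector grad_psi f(x; psi), flattened.

    Because differentiation is linear, the gradient of the batch-mean
    output equals the mean of the per-example moment vectors, so one
    backward pass yields the empirical moment average.
    """
    out = moment_net(x).mean()
    grads = torch.autograd.grad(out, list(moment_net.parameters()),
                                create_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def generator_step(generator, opt, real_x, z):
    """One MoLM-style generator update: match model moments to data
    moments in squared error."""
    target = mean_moments(real_x).detach()   # data moments, held fixed
    model = mean_moments(generator(z))       # differentiable in G
    loss = ((target - model) ** 2).sum()
    opt.zero_grad()
    loss.backward()  # second-order autodiff through grad_psi f
    opt.step()
    return loss.item()
```

In the full method, the moment network is also periodically retrained with the learned-moment objective so that the moments it defines shrink the asymptotic variance of the estimated generator parameters; the sketch above shows only the matching step against a fixed moment network.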
