Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

We propose a method to optimize the representation and distinguishability of samples from two probability distributions by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning with generative adversarial networks (GANs), in which a model attempts to generate realistic samples and a discriminator attempts to tell these apart from data samples. In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples or on features of the samples; second, to evaluate the performance of a generative model by testing the model's samples against a reference data set. In the latter role, the optimized MMD is particularly helpful, as it gives an interpretable indication of how the model and data distributions differ, even in cases where individual model samples are not easily distinguished either by eye or by a classifier.
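To make the test-power criterion concrete, the sketch below selects a Gaussian kernel bandwidth by maximizing the ratio of the unbiased MMD^2 estimate to an estimate of its standard deviation. This is a minimal illustration under stated assumptions, not the authors' implementation: the bandwidth grid and grid search stand in for gradient-based optimization of kernel parameters, and a simple bootstrap standard deviation stands in for an asymptotic variance estimate. All sample sizes, bandwidths, and helper names here are hypothetical.

```python
# Minimal sketch: power-maximized kernel choice for an MMD two-sample test.
# Assumptions: Gaussian kernel, a small hand-picked bandwidth grid, and a
# bootstrap estimate of the MMD estimator's standard deviation (a stand-in
# for a closed-form asymptotic variance estimate).
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2_unbiased(X, Y, sigma):
    """Unbiased estimate of MMD^2 = E k(x,x') + E k(y,y') - 2 E k(x,y)."""
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, sigma)
    Kyy = gaussian_kernel(Y, Y, sigma)
    Kxy = gaussian_kernel(X, Y, sigma)
    # Drop diagonal terms so the within-sample averages are unbiased.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

def power_criterion(X, Y, sigma, n_boot=100, rng=None):
    """Proxy for test power: MMD^2 estimate over its bootstrap std."""
    rng = rng or np.random.default_rng(0)
    est = mmd2_unbiased(X, Y, sigma)
    boots = []
    for _ in range(n_boot):
        bx = rng.integers(0, len(X), len(X))  # resample with replacement
        by = rng.integers(0, len(Y), len(Y))
        boots.append(mmd2_unbiased(X[bx], Y[by], sigma))
    return est / (np.std(boots) + 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 1.0, size=(200, 2))   # "data" samples
    Y = rng.normal(0.5, 1.0, size=(200, 2))   # "model" samples
    sigmas = [0.25, 0.5, 1.0, 2.0, 4.0]
    best = max(sigmas, key=lambda s: power_criterion(X, Y, s, rng=rng))
    print("selected bandwidth:", best)
    print("MMD^2 at that bandwidth:", mmd2_unbiased(X, Y, best))
```

A kernel chosen this way is more sensitive to the directions in which the two distributions actually differ, which is what makes the resulting test statistic useful both as a GAN discriminator and as an interpretable model-criticism tool.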
