Two Generator Game: Learning to Sample via Linear Goodness-of-Fit Test

Learning the probability distribution of high-dimensional data is a challenging problem. To address it, we formulate a deep energy adversarial network (DEAN), which casts sampling from the energy model learned from real data as the optimization of a goodness-of-fit (GOF) test statistic. DEAN can be interpreted as a GOF game between two generative networks: an explicit generative network learns an energy-based distribution that fits the real data, while an implicit generative network is trained by minimizing a GOF test statistic between the energy-based distribution and the generated data, so that the underlying distribution of the generated data stays close to the energy-based distribution. We design a two-level alternating optimization procedure to train the explicit and implicit generative networks, so that the hyper-parameters are also learned automatically. Experimental results show that DEAN produces higher-quality samples than state-of-the-art approaches.
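To make the two-network game concrete, below is a minimal PyTorch sketch of one plausible instantiation. It assumes an RBF kernel with a fixed bandwidth, the linear-time kernelized Stein discrepancy (KSD) as the GOF statistic minimized by the implicit network, and a simple contrastive surrogate loss for the explicit (energy) network; the architectures, losses, bandwidth `sigma`, and placeholder data are illustrative assumptions, not the paper's exact choices.

```python
# Minimal sketch of the GOF game described above (PyTorch). Assumed, not
# taken from the paper: the RBF kernel, a fixed bandwidth, the linear-time
# kernelized Stein discrepancy (KSD) as the GOF statistic, a contrastive
# surrogate loss for the energy network, and all architectures/data.
import torch
import torch.nn as nn

def score(E, x):
    # Score of the energy-based density p(x) ∝ exp(-E(x)):  s(x) = -∇x E(x).
    # create_graph=True keeps the graph so the KSD can backprop through s(x).
    return -torch.autograd.grad(E(x).sum(), x, create_graph=True)[0]

def ksd_linear(E, x, sigma=1.0):
    # Linear-time KSD estimate: average the Stein kernel u_p(x, x') over
    # disjoint sample pairs (x1, x2), (x3, x4), ... instead of all pairs.
    n = x.shape[0] // 2 * 2
    xa, xb = x[0:n:2], x[1:n:2]
    sa, sb = score(E, xa), score(E, xb)
    diff = xa - xb
    sq = (diff ** 2).sum(dim=1)
    k = torch.exp(-sq / (2 * sigma ** 2))               # RBF kernel per pair
    d = x.shape[1]
    u = k * ((sa * sb).sum(1)                           # s(x)' s(x') k(x, x')
             + ((sa - sb) * diff).sum(1) / sigma ** 2   # cross terms with ∇k
             + d / sigma ** 2 - sq / sigma ** 4)        # tr(∇x ∇x' k)
    return u.mean()

dim, zdim, batch = 2, 16, 128
E = nn.Sequential(nn.Linear(dim, 128), nn.Softplus(), nn.Linear(128, 1))
G = nn.Sequential(nn.Linear(zdim, 128), nn.ReLU(), nn.Linear(128, dim))
energy = lambda x: E(x).squeeze(-1)
opt_e = torch.optim.Adam(E.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)

for step in range(2000):
    real = torch.randn(batch, dim) + 2.0                # placeholder real data
    # Explicit network: fit the energy model to the real data, using
    # generator samples as negatives (a common EBM surrogate, assumed here).
    fake = G(torch.randn(batch, zdim)).detach()
    loss_e = energy(real).mean() - energy(fake).mean()
    opt_e.zero_grad(); loss_e.backward(); opt_e.step()
    # Implicit network: pull the generated distribution toward the
    # energy-based distribution by minimizing the linear-time KSD.
    x = G(torch.randn(batch, zdim))
    loss_g = ksd_linear(energy, x)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

The linear-time estimator is what keeps the generator step cheap: it touches each sample once via disjoint pairs, so the GOF statistic costs O(n) per batch rather than the O(n^2) of the full pairwise KSD.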
