Paper information - GenEval: A Benchmark Suite for Evaluating Generative Models

Generative models are important for several practical applications, from low-level image processing tasks to model-based planning in robotics. More generally, the study of generative models is motivated by the long-standing endeavor to model uncertainty and to discover structure by leveraging unlabeled data. Unfortunately, the lack of an ultimate task of interest has hindered progress in the field: there is no established way to compare models, and evaluation is often based on mere visual inspection of samples drawn from them. In this work, we aim to address this problem by introducing a new benchmark evaluation suite, dubbed GenEval. GenEval hosts a large array of distributions that capture, in a controlled setting, many important properties of real datasets, such as lower intrinsic dimensionality, multi-modality, compositionality, independence, and causal structure. Any model can be easily plugged in for evaluation, provided it can generate samples. Our extensive evaluation suggests that different models have different strengths, and that GenEval is a great tool for gaining insight into how models and metrics work. We offer GenEval to the community and believe that this benchmark will facilitate the comparison and development of new generative models.
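To make the idea of sample-based evaluation on a controlled distribution concrete, the following is a minimal sketch and not the GenEval code or API. It assumes a hypothetical multi-modal target (Gaussians placed evenly on a ring) and a hypothetical metric, `mode_coverage`, which reports the fraction of modes that receive at least one sample. All function names, parameters, and thresholds are illustrative assumptions. The only requirement on the "model" is that it can produce samples.

```python
import math
import random


def ring_modes(n_modes=8, radius=2.0):
    """Centers of n_modes Gaussians placed evenly on a circle (hypothetical target)."""
    return [
        (radius * math.cos(2 * math.pi * k / n_modes),
         radius * math.sin(2 * math.pi * k / n_modes))
        for k in range(n_modes)
    ]


def sample_ring(n, modes, std=0.05, rng=None):
    """Draw n 2-D points from a uniform mixture of isotropic Gaussians at `modes`."""
    rng = rng or random.Random(0)
    samples = []
    for _ in range(n):
        cx, cy = rng.choice(modes)
        samples.append((rng.gauss(cx, std), rng.gauss(cy, std)))
    return samples


def mode_coverage(samples, modes, std=0.05, n_std=3.0):
    """Fraction of modes with at least one sample within n_std * std of the center."""
    covered = set()
    for x, y in samples:
        for k, (cx, cy) in enumerate(modes):
            if math.hypot(x - cx, y - cy) <= n_std * std:
                covered.add(k)
    return len(covered) / len(modes)


# Stand-ins for two "models": one samples every mode, one has collapsed to two modes.
modes = ring_modes()
full_samples = sample_ring(2000, modes, rng=random.Random(0))
collapsed_samples = sample_ring(2000, modes[:2], rng=random.Random(1))

print("coverage (all modes):", mode_coverage(full_samples, modes))
print("coverage (collapsed):", mode_coverage(collapsed_samples, modes))
```

Because the target distribution is known exactly, a failure such as mode collapse shows up as a number rather than as an impression formed by inspecting samples. A real suite would likely use richer distributions and metrics than this toy coverage score.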

Marc'Aurelio Ranzato | Arthur Szlam | Anton Bakhtin
