A Non-Parametric Test to Detect Data-Copying in Generative Models

Detecting overfitting in generative models is an important challenge in machine learning. In this work, we formalize a form of overfitting that we call data-copying, where the generative model memorizes and outputs training samples or small variations thereof. We provide a three-sample non-parametric test for detecting data-copying that uses the training set, a separate sample from the target distribution, and a generated sample from the model, and study the performance of our test on several canonical models and datasets. For code and examples, visit this https URL
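To make the three-sample setup concrete, here is a minimal sketch of one way such a test could be built. It is not the authors' implementation. The abstract does not say how the three samples are compared, so the choices here are assumptions: the Euclidean nearest-neighbor distance to the training set, the Mann-Whitney U statistic as the non-parametric comparison, and all function names (`nn_distance`, `mann_whitney_z`, `data_copying_z`). The idea is that if generated points sit systematically closer to the training set than fresh points from the target distribution do, the model may be copying.

```python
import math
import random


def nn_distance(x, train):
    """Euclidean distance from point x to its nearest neighbor in train."""
    return min(math.dist(x, t) for t in train)


def mann_whitney_z(a, b):
    """z-scored Mann-Whitney U statistic (normal approximation, no tie correction).

    Negative values mean the entries of `a` tend to be smaller than those of `b`.
    """
    n, m = len(a), len(b)
    # U counts pairs (a_i, b_j) with a_i > b_j; ties count as one half.
    u = sum((ai > bj) + 0.5 * (ai == bj) for ai in a for bj in b)
    mean = n * m / 2
    std = math.sqrt(n * m * (n + m + 1) / 12)
    return (u - mean) / std


def data_copying_z(train, test, generated):
    """Assumed statistic: compare distances-to-training-set of the generated
    sample against those of a held-out sample from the target distribution."""
    d_gen = [nn_distance(g, train) for g in generated]
    d_test = [nn_distance(x, train) for x in test]
    return mann_whitney_z(d_gen, d_test)


# Toy demonstration on 2-D Gaussian data (synthetic, for illustration only).
rng = random.Random(0)


def sample(n):
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]


train = sample(200)   # training set
test = sample(100)    # separate sample from the target distribution

# A "copying" model: training points plus tiny noise.
copied = [(x + rng.gauss(0, 0.01), y + rng.gauss(0, 0.01))
          for x, y in rng.sample(train, 100)]
# A well-behaved model: fresh draws from the target distribution.
fresh = sample(100)

z_copy = data_copying_z(train, test, copied)
z_fresh = data_copying_z(train, test, fresh)
print("copying model z:", round(z_copy, 2))
print("fresh model z:  ", round(z_fresh, 2))
```

In this toy setup a strongly negative score flags the copying model, while the fresh sampler scores much closer to zero. The pairwise loops are quadratic in sample size and are meant for readability, not for use on real datasets.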

Sanjoy Dasgupta | Kamalika Chaudhuri | Casey Meehan
