Pros and Cons of GAN Evaluation Measures

Generative models, in particular generative adversarial networks (GANs), have received a lot of attention recently. Many GAN variants have been proposed and applied in a wide range of applications. Despite large strides in theoretical progress, evaluating and comparing GANs remains a daunting task. Although several measures have been introduced, there is as yet no consensus on which measure best captures the strengths and limitations of models and should be used for fair model comparison. As in other areas of computer vision and machine learning, it is critical to settle on one or a few good measures to steer progress in this field. In this paper, I review and critically discuss more than 19 quantitative and 4 qualitative measures for evaluating generative models, with a particular emphasis on GAN-derived models.
