Assessing a Single Image in Reference-Guided Image Synthesis

Assessing the performance of Generative Adversarial Networks (GANs) has been an important topic due to its practical significance. Although several evaluation metrics have been proposed, they generally assess the quality of the whole generated image distribution. These metrics are not applicable to Reference-guided Image Synthesis (RIS) tasks, i.e., rendering a source image in the style of another reference image, where the quality of each individual generated image must be assessed. In this paper, we propose a general learning-based framework, Reference-guided Image Synthesis Assessment (RISA), to quantitatively evaluate the quality of a single generated image. Notably, training RISA does not require human annotations. Specifically, the training data for RISA are produced by the intermediate models saved during the RIS training procedure and weakly annotated with the number of training iterations, exploiting the positive correlation between image quality and iteration count. Since this annotation is too coarse to serve as a supervision signal, we introduce two techniques: 1) a pixel-wise interpolation scheme to refine the coarse labels, and 2) multiple binary classifiers in place of a naïve regressor. In addition, an unsupervised contrastive loss is introduced to effectively capture the style similarity between a generated image and its reference image. Empirical results on various datasets demonstrate that RISA is highly consistent with human preference and transfers well across models.
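
To make two of the mechanisms described above concrete, the following is a minimal PyTorch sketch of (a) replacing a single quality regressor with multiple binary classifiers, each deciding "is the quality above level k?", and (b) an InfoNCE-style contrastive loss that pulls a generated image's style embedding toward its own reference and away from other references in the batch. The class and function names, the number of levels K, and the InfoNCE formulation are illustrative assumptions, not the paper's exact architecture or loss.

```python
# Illustrative sketch only; names and formulations are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrdinalQualityHead(nn.Module):
    """Predicts quality via K binary decisions: 'is quality above level k?'.

    The final score is the number of thresholds passed (in expectation),
    a smoother learning target than direct regression on coarse labels.
    """
    def __init__(self, feat_dim: int, num_levels: int):
        super().__init__()
        self.classifiers = nn.Linear(feat_dim, num_levels)  # one logit per level

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(self.classifiers(feats))  # (B, K), each in [0, 1]
        return probs.sum(dim=1)                         # scalar quality score per image

def ordinal_targets(levels: torch.Tensor, num_levels: int) -> torch.Tensor:
    """Turn an integer quality level (e.g., derived from the checkpoint's
    iteration count) into K binary targets: target[k] = 1 iff level > k."""
    ks = torch.arange(num_levels, device=levels.device)
    return (levels.unsqueeze(1) > ks.unsqueeze(0)).float()  # (B, K)

def contrastive_style_loss(gen_emb: torch.Tensor,
                           ref_emb: torch.Tensor,
                           temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss: each generated image's style embedding should be
    most similar to its own reference (the diagonal of the logits matrix)."""
    gen_emb = F.normalize(gen_emb, dim=1)
    ref_emb = F.normalize(ref_emb, dim=1)
    logits = gen_emb @ ref_emb.t() / temperature          # (B, B) similarities
    labels = torch.arange(gen_emb.size(0), device=gen_emb.device)
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    # Toy usage: quality levels 0..9, as if taken from ten saved checkpoints.
    B, D, K = 8, 256, 10
    feats = torch.randn(B, D)
    levels = torch.randint(0, K, (B,))
    head = OrdinalQualityHead(D, K)
    probs = torch.sigmoid(head.classifiers(feats))
    bce = F.binary_cross_entropy(probs, ordinal_targets(levels, K))
    score = head(feats)  # differentiable per-image quality estimate
    style = contrastive_style_loss(torch.randn(B, 64), torch.randn(B, 64))
```

Decomposing regression into binary classifications is a standard ordinal-regression trick (used, e.g., in age estimation); here each classifier only has to separate "before" from "after" a given training stage, which tolerates the coarseness of iteration-based labels.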
