Image Quality Assessment Using Contrastive Learning

We consider the problem of obtaining image quality representations in a self-supervised manner. We use prediction of distortion type and degree as an auxiliary task to learn features from an unlabeled image dataset containing a mixture of synthetic and realistic distortions. We then train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem. We refer to the proposed training framework and resulting deep IQA model as the CONTRastive Image QUality Evaluator (CONTRIQUE). During evaluation, the CNN weights are frozen and a linear regressor maps the learned representations to quality scores in a No-Reference (NR) setting. We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models, even without any additional fine-tuning of the CNN backbone. The learned representations are highly robust and generalize well across images afflicted by either synthetic or authentic distortions. Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets. The implementations used in this paper are available at https://github.com/pavancm/CONTRIQUE.
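The two stages described above, a contrastive pairwise objective over distortion classes followed by a linear regressor on frozen features, can be illustrated with a minimal NumPy sketch. This is not the released CONTRIQUE code. The exact loss form, the temperature value, the use of ridge regularization, and all function names (`pairwise_contrastive_loss`, `fit_linear_regressor`, `predict_quality`) are assumptions made for illustration. Labels here stand for a hypothetical integer id per distortion type and degree.

```python
import numpy as np


def pairwise_contrastive_loss(features, labels, temperature=0.1):
    """Contrastive loss in which samples sharing a label are positives.

    Assumed form (not taken from the paper): a temperature-scaled
    softmax over cosine similarities, averaged over positive pairs.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Exclude each sample's similarity with itself from the softmax.
    sim = np.where(not_self, sim, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_norm
    positives = (labels[:, None] == labels[None, :]) & not_self
    has_pos = positives.any(axis=1)
    # Average the log-probability over each anchor's positives.
    pos_sum = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_sum[has_pos] / positives.sum(axis=1)[has_pos]
    return per_anchor.mean()


def _with_bias(features):
    # Append a constant column so the regressor learns an intercept.
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_linear_regressor(features, scores, l2=1.0):
    """Closed-form ridge regression from frozen features to scores."""
    X = _with_bias(features)
    penalty = l2 * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ scores)


def predict_quality(features, weights):
    """Map frozen features to predicted quality scores."""
    return _with_bias(features) @ weights
```

In this sketch the backbone would be trained by minimizing the first function on unlabeled, distortion-tagged images. Only the regressor weights would then be fit on a labeled quality dataset, with the backbone features held fixed.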
