IQMA Network: Image Quality Multi-scale Assessment Network

Image Quality Assessment (IQA), which aims to provide computational models for automatically predicting perceptual image quality, is an important computer vision task with many applications. In recent years, a variety of IQA methods have been proposed based on different metric designs, each measuring the quality of images affected by various types of distortion. However, the rapid development of Generative Adversarial Networks (GANs) has brought a new challenge to the IQA community. In particular, GAN-based Image Reconstruction (IR) methods overfit traditional PSNR-based IQA methods by generating images with sharper edges and texture-like noise, so that the outputs resemble the reference image in appearance but lose detail. In this paper, we propose a bilateral-branch multi-scale image quality estimation network, named the IQMA network. The two branches use a Feature Pyramid Network (FPN)-like architecture and separately extract multi-scale features from patches of the reference image and the corresponding patches of the distorted image. Features of the same scale from both branches are then sent to scale-specific feature fusion modules, each of which performs feature fusion followed by a newly designed pooling operation on the corresponding features. Several score regression modules then learn a quality score for each scale. Finally, the scores from the different scales are fused into the quality score of the image. The IQMA network achieved 1st place on the NTIRE 2021 IQA public leaderboard and 2nd place on the NTIRE 2021 IQA private leaderboard, and it consistently outperforms existing state-of-the-art (SOTA) methods on LIVE and TID2013.
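The data flow described above (two branches, per-scale fusion and pooling, per-scale score regression, score fusion) can be illustrated with a toy sketch. This is not the IQMA network itself: the average-pooling pyramid stands in for the FPN-like branches, the absolute difference stands in for the fusion module, mean pooling stands in for the paper's pooling operation, and the fixed mapping in `regress_score` stands in for a learned regressor. All function names here are hypothetical.

```python
from statistics import mean

def downsample(patch):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(patch) // 2, len(patch[0]) // 2
    return [[(patch[2 * i][2 * j] + patch[2 * i][2 * j + 1]
              + patch[2 * i + 1][2 * j] + patch[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]

def pyramid(patch, num_scales):
    """Stand-in for one FPN-like branch: a list of progressively coarser maps."""
    levels = [patch]
    for _ in range(num_scales - 1):
        levels.append(downsample(levels[-1]))
    return levels

def fuse_and_pool(ref_feat, dist_feat):
    """Assumed fusion: element-wise absolute difference, then mean pooling."""
    diffs = [abs(r - d)
             for ref_row, dist_row in zip(ref_feat, dist_feat)
             for r, d in zip(ref_row, dist_row)]
    return mean(diffs)

def regress_score(pooled):
    """Assumed regressor: a fixed monotone map; a real model would learn this."""
    return 1.0 / (1.0 + pooled)

def patch_quality(ref_patch, dist_patch, num_scales=3):
    """Return (fused score, per-scale scores) for one reference/distorted pair."""
    ref_levels = pyramid(ref_patch, num_scales)    # reference branch
    dist_levels = pyramid(dist_patch, num_scales)  # distorted branch
    scale_scores = [regress_score(fuse_and_pool(r, d))
                    for r, d in zip(ref_levels, dist_levels)]
    # Assumed score fusion: plain average over scales.
    return mean(scale_scores), scale_scores

# Toy 8x8 grayscale patch and a brightness-shifted copy of it.
ref = [[(i * 8 + j) / 63 for j in range(8)] for i in range(8)]
dist = [[v + 0.1 for v in row] for row in ref]
score, per_scale = patch_quality(ref, dist)
print(score, per_scale)
```

In this sketch an undistorted patch scores exactly 1.0 and any distortion lowers the score; the actual network replaces each stand-in with trained modules.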
