U-Shaped Transformer-Based 360-Degree No Reference Image Quality Assessment

Thanks to creative rendering and display techniques, 360-degree images can provide a more immersive and interactive experience for streaming users. However, these features make the perceptual characteristics of 360-degree images more complex than those of fixed-view images, so a comprehensive and accurate image quality assessment (IQA) cannot be achieved with a simple stack of pre-processing, post-processing, compression, and rendering tasks. To thoroughly learn global and local features in 360-degree images, reduce the complexity of multichannel neural network models, and simplify the training process, this paper proposes a user-aware joint architecture and an efficient converter dedicated to 360-degree no-reference (NR) IQA. The input of the proposed method is a 360-degree image in cubemap projection (CMP) format. The proposed 360-degree NR IQA method includes a non-overlapping self-attention selection module based on a dominant map and a feature extraction module based on a U-shaped transformer (U-former), which together address perceptual region significance and projection distortion. The transformer-based architecture and a weighted averaging technique are jointly used to predict local perceptual quality. Experimental results on widely used databases show that the proposed model outperforms other state-of-the-art NR 360-degree IQA methods. In addition, cross-database evaluation and ablation studies demonstrate the robustness and generalization ability of the proposed model.
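
The weighted averaging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weights are obtained or how patches are selected, so the function name `weighted_quality`, the per-patch scores, and the weights below are all hypothetical; this shows only generic weighted-average pooling of local quality scores, not the authors' implementation.

```python
def weighted_quality(patch_scores, patch_weights):
    """Pool per-patch quality scores into one score by weighted averaging.

    Assumption: each patch has a predicted quality score and a
    non-negative importance weight (for example, derived from a
    significance map). Both inputs are hypothetical here.
    """
    if len(patch_scores) != len(patch_weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(patch_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean: patches with larger weights contribute more.
    return sum(s * w for s, w in zip(patch_scores, patch_weights)) / total


# Hypothetical example: three patches, the last one weighted twice as much.
scores = [3.0, 4.0, 5.0]
weights = [1.0, 1.0, 2.0]
print(weighted_quality(scores, weights))  # (3 + 4 + 10) / 4 = 4.25
```

Normalizing by the weight sum keeps the pooled score on the same scale as the per-patch scores, regardless of how many patches are selected.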
