An End-to-End No-Reference Video Quality Assessment Method With Hierarchical Spatiotemporal Feature Representation

In this paper, we propose a deep neural network-based no-reference (NR) video quality assessment (VQA) method that uses spatiotemporal feature fusion and hierarchical information integration to evaluate the perceptual quality of videos. First, we propose a feature extraction model that uses 2D and 3D convolutional layers to gradually extract spatiotemporal features from raw video clips. Second, we design a hierarchical branching network to fuse multiframe features, and the feature vectors at every hierarchical level contribute to network optimization. Finally, these two modules and quality regression are combined into an end-to-end architecture. Experimental results obtained on benchmark VQA databases demonstrate the superiority of our method over other state-of-the-art algorithms. The source code is available online.
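As a rough illustration of the hierarchical fusion idea described above, the following minimal sketch builds a hierarchy from per-frame feature vectors and sums a regression loss over every level, so that each level contributes to optimization. It is not the authors' network. The pairwise-mean fusion, the shared linear regressor, the squared-error loss, and all function names (`fuse_pair`, `build_hierarchy`, `level_score`, `multi_level_loss`) are assumptions made for illustration; the actual method uses learned 2D and 3D convolutional layers and a learned branching network.

```python
from typing import List

Vector = List[float]


def fuse_pair(a: Vector, b: Vector) -> Vector:
    """Hypothetical fusion step: element-wise mean of two feature vectors.

    A real network would use learned layers here; the mean is a stand-in.
    """
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def build_hierarchy(frame_feats: List[Vector]) -> List[List[Vector]]:
    """Fuse neighbouring vectors repeatedly until a single vector remains.

    Returns all levels, from the per-frame features up to the fused top.
    """
    levels = [frame_feats]
    current = frame_feats
    while len(current) > 1:
        nxt = [fuse_pair(current[i], current[i + 1])
               for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            # An unpaired trailing vector is carried up unchanged.
            nxt.append(current[-1])
        levels.append(nxt)
        current = nxt
    return levels


def level_score(level: List[Vector], weights: Vector, bias: float) -> float:
    """Mean-pool the vectors of one level, then apply a linear regressor."""
    dim = len(weights)
    pooled = [sum(v[d] for v in level) / len(level) for d in range(dim)]
    return sum(w * p for w, p in zip(weights, pooled)) + bias


def multi_level_loss(levels: List[List[Vector]], weights: Vector,
                     bias: float, target: float) -> float:
    """Sum of squared errors between each level's score and the target.

    Summing over levels is an assumed way of letting every hierarchical
    level influence training; the paper's exact loss may differ.
    """
    return sum((level_score(lv, weights, bias) - target) ** 2
               for lv in levels)


if __name__ == "__main__":
    # Four hypothetical 2-D frame features and a made-up quality label.
    feats = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
    hierarchy = build_hierarchy(feats)
    print(len(hierarchy), multi_level_loss(hierarchy, [0.5, 0.5], 0.0, 4.5))
```

In an end-to-end setting the fusion step and the regressor would be trainable, and the gradient of the summed loss would flow back through every level into the feature extractor.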
