RAPIQUE: Rapid and Accurate Video Quality Prediction of User Generated Content

Blind, or no-reference, video quality assessment of user-generated content (UGC) has become a trending, challenging, and heretofore unsolved problem. Accurate and efficient video quality predictors suitable for this content are thus in great demand to enable more intelligent analysis and processing of UGC videos. Previous studies have shown that natural scene statistics and deep learning features are both sufficient to capture spatial distortions, which account for a significant share of UGC video quality issues. However, these models are either incapable of, or inefficient at, predicting the quality of complex and diverse UGC videos in practical applications. Here we introduce an effective and efficient video quality model for UGC, which we dub the Rapid and Accurate Video Quality Evaluator (RAPIQUE). We show that it performs comparably to state-of-the-art (SOTA) models, but with orders-of-magnitude faster runtime. RAPIQUE combines and leverages the advantages of both quality-aware scene statistics features and semantics-aware deep convolutional features, allowing us to design the first general and efficient spatial and temporal (space-time) bandpass statistics model for video quality modeling. Our experimental results on recent large-scale UGC video quality databases show that RAPIQUE delivers top performance on all the datasets at a considerably lower computational expense. We hope this work promotes and inspires further efforts towards practical modeling of video quality problems for potential real-time and low-latency applications.
