Quality Assessment of In-the-Wild Videos

Quality assessment of in-the-wild videos is a challenging problem because reference videos are unavailable and the videos suffer from distortions introduced during shooting. Knowledge of the human visual system can help establish methods for objective quality assessment of in-the-wild videos. In this work, we show that two prominent effects of the human visual system, namely content-dependency and temporal-memory effects, can be used for this purpose. We propose an objective no-reference video quality assessment method that integrates both effects into a deep neural network. For content-dependency, we extract features from a pre-trained image classification neural network, exploiting its inherent content-aware property. For temporal-memory effects, long-term dependencies, especially temporal hysteresis, are integrated into the network with a gated recurrent unit and a subjectively-inspired temporal pooling layer. To validate the performance of our method, we conduct experiments on three publicly available in-the-wild video quality assessment databases: KoNViD-1k, CVD2014, and LIVE-Qualcomm. Experimental results demonstrate that our proposed method outperforms five state-of-the-art methods by a large margin: specifically, it achieves 12.39%, 15.71%, 15.45%, and 18.09% overall performance improvements over the second-best method, VBLIINDS, in terms of SROCC, KROCC, PLCC, and RMSE, respectively. Moreover, the ablation study verifies the crucial role of both the content-aware features and the modeling of temporal-memory effects. The PyTorch implementation of our method is released at https://github.com/lidq92/VSFA.
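To make the idea of a hysteresis-style temporal pooling step more concrete, the following is a minimal sketch in plain Python, not the released implementation. It assumes per-frame quality scores are already available (for example, from a recurrent layer) and blends a "memory" term (the worst recent score) with a "current" term (a softmin-weighted average over upcoming frames). The function name `hysteresis_pool`, the window length `tau`, and the blend weight `gamma` are illustrative assumptions and are not specified in the abstract.

```python
import math


def hysteresis_pool(frame_scores, tau=12, gamma=0.5):
    """Pool per-frame quality scores into one video-level score.

    Illustrative sketch only: tau (window length in frames) and gamma
    (blend weight) are assumed values, not taken from the abstract.
    """
    n = len(frame_scores)
    if n == 0:
        raise ValueError("frame_scores must not be empty")

    pooled = []
    for t in range(n):
        # Memory element: the worst score in the preceding window,
        # reflecting that poor quality lingers in a viewer's judgment.
        past = frame_scores[max(0, t - tau):t]
        memory = min(past) if past else frame_scores[t]

        # Current element: softmin-weighted mean over the upcoming window,
        # so that low-quality frames receive larger weights.
        future = frame_scores[t:t + tau]
        lowest = min(future)
        # Subtracting the minimum keeps the exponentials numerically stable.
        weights = [math.exp(-(q - lowest)) for q in future]
        current = sum(w * q for w, q in zip(weights, future)) / sum(weights)

        pooled.append(gamma * memory + (1.0 - gamma) * current)

    # The video-level score is the average of the blended frame scores.
    return sum(pooled) / n


if __name__ == "__main__":
    steady = [0.8] * 22
    dipped = [0.8] * 10 + [0.2] * 2 + [0.8] * 10
    print(hysteresis_pool(steady))
    print(hysteresis_pool(dipped), sum(dipped) / len(dipped))
```

Under these assumptions, a short quality drop lowers the pooled score more than it lowers a plain average, which is the asymmetry that temporal hysteresis is meant to capture.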
