Saliency in objective video quality assessment: What is the ground truth?

Finding ways to objectively and reliably assess video quality as perceived by humans has become a pressing concern in the multimedia community. To enhance the performance of video quality metrics (VQMs), a research trend is to incorporate aspects of visual saliency. Existing approaches have focused on using a computational saliency model to improve a VQM. Since saliency models remain limited in predicting where people look in videos, the benefit of including saliency in a VQM may depend heavily on the accuracy of the saliency model used. To gain insight into the actual added value of saliency in VQMs, ground-truth saliency obtained from eye tracking, rather than computational saliency, is an essential prerequisite. However, collecting eye-tracking data in the context of video quality is confronted with a bias caused by massive stimulus repetition. In this paper, we introduce a new experimental methodology that alleviates this potential bias and, consequently, delivers reliable data. We recorded eye movements from 160 human observers while they freely viewed 160 video stimuli distorted with different distortion types at various degradation levels. We analyse the extent to which ground-truth saliency, as well as computational saliency, actually benefits existing state-of-the-art VQMs. Our dataset opens new challenges for saliency modelling in video quality research and helps better gauge progress in developing saliency-based VQMs.
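The common way saliency is incorporated into a VQM, as described above, is to weight a per-pixel (or per-block) quality/distortion map by a saliency map before pooling it into a single score. The following is a minimal sketch of that idea, not the paper's specific method; the function name and the uniform toy inputs are illustrative assumptions.

```python
import numpy as np

def saliency_weighted_score(quality_map, saliency_map, eps=1e-8):
    """Pool a per-pixel quality map into one score, weighting each pixel
    by its saliency (ground-truth fixation density or model output).
    Both maps are 2-D arrays of the same shape; weights are normalised
    to sum to (approximately) one."""
    weights = saliency_map / (saliency_map.sum() + eps)
    return float((quality_map * weights).sum())

# Toy check: with a uniform quality map, any non-degenerate saliency
# weighting recovers the uniform value.
quality = np.full((4, 4), 0.5)
saliency = np.ones((4, 4))
print(saliency_weighted_score(quality, saliency))
```

Under this scheme, substituting a ground-truth fixation map for the output of a computational saliency model is a one-argument change, which is what makes the eye-tracking dataset directly usable for gauging the added value of saliency in existing VQMs.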
