Benchmark three-dimensional eye-tracking dataset for visual saliency prediction on stereoscopic three-dimensional video

Abstract. Visual attention models (VAMs) predict the image or video regions that are most likely to attract human attention. Although saliency detection is well explored for two-dimensional (2-D) image and video content, only a few attempts have been made to design three-dimensional (3-D) saliency prediction models. Newly proposed 3-D VAMs have to be validated on large-scale video saliency prediction datasets that include eye-tracking data. Several eye-tracking datasets are publicly available for 2-D image and video content. For 3-D, however, the research community still needs large-scale video saliency datasets for validating different 3-D VAMs. We introduce a large-scale dataset containing eye-tracking data for 61 stereoscopic 3-D videos (and their 2-D versions), collected in a free-viewing test with 24 participating subjects. We evaluate the performance of existing saliency detection methods on the proposed dataset. In addition, we created an online benchmark for validating the performance of existing 2-D and 3-D VAMs and for facilitating the addition of new VAMs. The benchmark currently contains 50 different VAMs.
