Benchmark 3D eye-tracking dataset for visual saliency prediction on stereoscopic 3D video

Visual Attention Models (VAMs) predict the locations of image or video regions that are most likely to attract human attention. Although saliency detection is well explored for 2D image and video content, only a few attempts have been made to design 3D saliency prediction models. Newly proposed 3D visual attention models have to be validated on large-scale video saliency prediction datasets that include eye-tracking data. Several eye-tracking datasets are publicly available for 2D image and video content. For 3D, however, the research community still lacks large-scale video saliency datasets for validating different 3D-VAMs. In this paper, we introduce a large-scale dataset containing eye-tracking data collected for 61 stereoscopic 3D videos (and their 2D versions) from 24 subjects who participated in a free-viewing test. We evaluate the performance of existing saliency detection methods on the proposed dataset. In addition, we created an online benchmark for validating the performance of existing 2D and 3D visual attention models and for facilitating the addition of new VAMs. Our benchmark currently contains 50 different VAMs.
