Scanpath and saliency prediction on 360-degree images

We introduce deep neural networks for scanpath and saliency prediction trained on 360-degree images. The scanpath prediction model, called SaltiNet, is based on a novel temporally-aware representation of saliency information named the saliency volume. The first part of the network is trained to generate saliency volumes, with its parameters fit by back-propagation using a binary cross-entropy (BCE) loss over downsampled versions of the saliency volumes. Sampling strategies over these volumes are then used to generate scanpaths over the 360-degree images. Our experiments show the advantages of using saliency volumes and how they can be applied to related tasks. We also show how a similar architecture achieves state-of-the-art performance on the related task of saliency map prediction. Our source code and trained models are available at https://github.com/massens/saliency-360salient-2017.
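To make the sampling idea concrete, the sketch below draws fixations from a saliency volume, treated here as a stack of per-time-slice probability maps of shape (T, H, W). The function name, arguments, and layout are illustrative assumptions, not the authors' released implementation; the actual sampling strategies are described in the paper and the linked repository.

```python
import numpy as np

def sample_scanpath(saliency_volume, fixations_per_slice=1, rng=None):
    """Sample a scanpath from a saliency volume of shape (T, H, W).

    Hypothetical sketch: each temporal slice is normalized into a
    probability map over image locations, fixations are drawn from it,
    and the draws are concatenated in temporal order.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, H, W = saliency_volume.shape
    scanpath = []
    for t in range(T):
        # Flatten the slice and turn it into a valid probability distribution.
        probs = np.clip(saliency_volume[t].astype(np.float64).ravel(), 0.0, None)
        probs /= probs.sum() + 1e-12
        # Draw one or more fixation locations for this time slice.
        idx = rng.choice(H * W, size=fixations_per_slice, p=probs)
        for i in idx:
            y, x = divmod(int(i), W)
            scanpath.append((t, x, y))  # (time slice, column, row)
    return scanpath
```

A usage example under the same assumptions: given a predicted volume `vol` with, say, T=20 slices, `sample_scanpath(vol)` returns 20 ordered fixations that can be projected back onto the equirectangular 360-degree image.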
