SalGCN: Saliency Prediction for 360-Degree Images Based on Spherical Graph Convolutional Networks

The non-Euclidean geometry of spherical data poses a challenge for saliency prediction on 360-degree images. Because spherical data cannot be projected onto a single plane without distortion, existing saliency prediction methods based on conventional CNNs are inefficient. In this paper, we propose SalGCN, a saliency prediction framework for 360-degree images based on graph convolutional networks that operates directly on spherical graph signals. Specifically, we adopt Geodesic ICOsahedral Pixelation (GICOPix) to construct a spherical graph signal from a spherical image in equirectangular projection (ERP) format. We then propose a graph saliency prediction network that extracts spherical features directly and generates a spherical graph saliency map; for this network we design an unpooling method, based on linear interpolation, that is suited to spherical graph signals. Training is formulated as a node regression problem between the input and output spherical graph signals, for which we further design a Kullback-Leibler (KL) divergence loss with sparse consistency so that the sparseness of the predicted saliency map is closer to that of the ground truth. Finally, to obtain an ERP-format saliency map for evaluation, we propose a spherical crown-based (SCB) interpolation method that converts the output spherical graph saliency map into ERP format. Experiments show that SalGCN achieves comparable or even better saliency prediction performance, both subjectively and objectively, with much lower computational complexity.
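As a rough illustration of the KL divergence term mentioned above, the following sketch compares a predicted and a ground-truth saliency signal defined on the nodes of a graph. It is not the loss used in SalGCN: the sparse consistency term is not specified in the abstract and is therefore omitted, and the function names `normalize` and `kl_divergence`, the `eps` constant, and the toy four-node signals are all assumptions made for this example.

```python
import math


def normalize(signal, eps=1e-12):
    """Scale a non-negative node signal so that its values sum to (about) 1."""
    total = sum(signal)
    return [v / (total + eps) for v in signal]


def kl_divergence(pred, target, eps=1e-12):
    """KL(target || pred) summed over graph nodes.

    Both signals are first normalized to probability distributions.
    Hypothetical helper: only the plain KL term, without any
    sparsity-related regularization.
    """
    p = normalize(pred, eps)
    q = normalize(target, eps)
    return sum(qi * math.log(eps + qi / (pi + eps)) for pi, qi in zip(p, q))


# Toy saliency values on four graph nodes (illustrative only).
target = [0.0, 0.0, 1.0, 3.0]
close_pred = [0.0, 0.1, 1.0, 3.0]
uniform_pred = [1.0, 1.0, 1.0, 1.0]

loss_close = kl_divergence(close_pred, target)
loss_uniform = kl_divergence(uniform_pred, target)
```

In this toy case the prediction that concentrates its mass on the same nodes as the target yields a smaller divergence than the uniform prediction, which is the behaviour a node-wise regression loss of this kind is meant to reward.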
