Graph-Based Saliency Fusion with Superpixel-Level Belief Propagation for 3D Fixation Prediction

In recent years, many 3D visual attention models (VAMs) with diverse fusion strategies have been proposed; the main challenge of fusion lies in the inconsistency, or even conflict, among the different saliency maps being combined. To address this challenge, we propose a graph-based fusion method with superpixel-level belief propagation for 3D fixation prediction on stereoscopic video, which casts the aggregation as a global optimization problem. After extracting multi-modality saliency maps, we construct a graph at the superpixel level and design an energy function that incorporates multi-modality constraints; this energy is minimized with the belief propagation algorithm. Experimental results on two databases demonstrate that the proposed model achieves competitive performance.
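The fusion step described above can be illustrated with a minimal sketch. The following is not the paper's implementation but a generic min-sum loopy belief propagation over a superpixel adjacency graph, under assumed design choices: saliency is quantized into discrete labels, the data cost penalizes disagreement with each modality's saliency map, and a pairwise cost (weight `lam`, a hypothetical parameter) encourages neighboring superpixels to take similar labels.

```python
import numpy as np

def fuse_saliency_bp(maps, edges, n_labels=8, lam=0.5, iters=10):
    """Fuse per-superpixel saliency maps via min-sum loopy belief propagation.

    maps:  (n_modalities, n_nodes) array, per-superpixel saliency in [0, 1]
    edges: list of (i, j) superpixel adjacencies
    Returns a fused saliency value in [0, 1] per superpixel.
    (Illustrative sketch; label set, costs, and schedule are assumptions.)
    """
    n_nodes = maps.shape[1]
    labels = np.linspace(0.0, 1.0, n_labels)
    # Data cost: each label's total disagreement with every modality's saliency.
    data = np.abs(labels[None, :, None] - maps[:, None, :]).sum(axis=0).T  # (n_nodes, n_labels)
    # Smoothness cost: penalize label differences between adjacent superpixels.
    smooth = lam * np.abs(labels[:, None] - labels[None, :])               # (n_labels, n_labels)

    # Directed messages m[(i, j)]: from superpixel i to neighbor j.
    msgs, nbrs = {}, {k: [] for k in range(n_nodes)}
    for i, j in edges:
        msgs[(i, j)] = np.zeros(n_labels)
        msgs[(j, i)] = np.zeros(n_labels)
        nbrs[i].append(j)
        nbrs[j].append(i)

    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            # Partial belief at i, excluding the message coming back from j.
            b = data[i] + sum(msgs[(k, i)] for k in nbrs[i] if k != j)
            # Min-sum update: minimize over i's label for each label of j.
            new[(i, j)] = (b[:, None] + smooth).min(axis=0)
            new[(i, j)] -= new[(i, j)].min()  # normalize for numerical stability
        msgs = new

    # Final belief per node; pick the minimum-energy label.
    belief = np.stack([
        data[i] + sum((msgs[(k, i)] for k in nbrs[i]), np.zeros(n_labels))
        for i in range(n_nodes)
    ])
    return labels[belief.argmin(axis=1)]
```

On a toy chain of three superpixels where two modalities agree (saliency 0.0, 0.5, 1.0), the fused result preserves the ordering while the smoothness term leaves the confident endpoints intact.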
