Paper Information - Visual Attention Prediction for Stereoscopic Video by Multi-Module Fully Convolutional Network

Visual attention is an important mechanism of the human visual system (HVS), and numerous saliency detection algorithms have recently been designed for 2D images and video. However, research on fixation detection for stereoscopic video remains limited and challenging because of the complicated depth and motion information involved. In this paper, we design a novel multi-module fully convolutional network (MM-FCN) for fixation detection of stereoscopic video. Specifically, we design a fully convolutional network for spatial saliency prediction (S-FCN), in which the initial spatial saliency map of the stereoscopic video is learned from an image database for object detection. Next, a fully convolutional network for temporal saliency prediction (T-FCN) is constructed by combining the saliency results from S-FCN with motion information from the video frames. Finally, a fully convolutional network for depth fixation prediction (D-FCN) computes the final fixation map of the stereoscopic video by learning depth features together with the spatiotemporal features from T-FCN. Experimental results show that the proposed MM-FCN predicts fixations for stereoscopic video more effectively and efficiently than other related fixation prediction methods.
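The data flow between the three modules can be sketched as follows. This is only an illustration of which inputs each stage consumes, not the authors' implementation: the function names (`s_fcn`, `t_fcn`, `d_fcn`, `mm_fcn`) are hypothetical, and each learned network is replaced by a hand-written placeholder (intensity contrast, frame differencing, and equal-weight averaging are all assumptions made here so the example runs without any deep learning library).

```python
from typing import List

Map = List[List[float]]  # a 2D map stored as rows of floats


def normalize(m: Map) -> Map:
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def s_fcn(frame: Map) -> Map:
    """Placeholder for S-FCN: spatial saliency from a single frame.
    Assumption: deviation from mean intensity stands in for the learned network."""
    mean = sum(sum(row) for row in frame) / (len(frame) * len(frame[0]))
    return normalize([[abs(v - mean) for v in row] for row in frame])


def t_fcn(spatial: Map, frame: Map, prev_frame: Map) -> Map:
    """Placeholder for T-FCN: combines the S-FCN output with motion information.
    Assumption: absolute frame difference stands in for the motion cue."""
    motion = normalize([[abs(a - b) for a, b in zip(r, p)]
                        for r, p in zip(frame, prev_frame)])
    return normalize([[0.5 * s + 0.5 * m for s, m in zip(rs, rm)]
                      for rs, rm in zip(spatial, motion)])


def d_fcn(spatiotemporal: Map, depth: Map) -> Map:
    """Placeholder for D-FCN: combines depth with the T-FCN output.
    Assumption: equal-weight averaging stands in for learned depth features."""
    d = normalize(depth)
    return normalize([[0.5 * st + 0.5 * dv for st, dv in zip(rs, rd)]
                      for rs, rd in zip(spatiotemporal, d)])


def mm_fcn(frame: Map, prev_frame: Map, depth: Map) -> Map:
    """Chain the three stages in the order given in the abstract."""
    spatial = s_fcn(frame)
    spatiotemporal = t_fcn(spatial, frame, prev_frame)
    return d_fcn(spatiotemporal, depth)


# Toy input: a bright, moving, depth-distinct pixel at the centre of a 3x3 frame.
frame = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
prev_frame = [[0.0] * 3 for _ in range(3)]
depth = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
result = mm_fcn(frame, prev_frame, depth)
```

The point of the sketch is the staging: the spatial map is computed first, the temporal stage sees both that map and the frames, and the depth stage sees the spatiotemporal result rather than the raw frames.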
