Deep fusion based video saliency detection

Abstract This paper introduces a novel video saliency model for salient object detection in videos. First, we generate multi-level deep features via a symmetrical convolutional neural network whose inputs are the current frame and the optical flow image. Then, the multi-level deep features are integrated hierarchically by a fusion network, which deploys an attention module to select among the deep features. Lastly, the integrated deep feature is combined with boundary information taken from the shallow layers of the feature extraction networks, and the saliency prediction step generates the saliency map. The key advantages of our model lie in the attention module, the hierarchical integration, and the boundary information: the attention module acts as a weight filter that selects the most salient regions in the deep features, the hierarchical integration provides an effective way to combine features from different layers, and the boundary information yields well-defined boundaries in the saliency map. Extensive experiments are performed on two challenging video datasets, and the experimental results show that our model consistently outperforms the state-of-the-art saliency models by a large margin.
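The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the abstract does not specify the layers, so the function names (`attention_gate`, `hierarchical_fuse`, `predict_saliency`), the per-channel weights standing in for a learned 1x1 convolution, the nearest-neighbour upsampling, and the additive merge are all assumptions made only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_gate(feat, w):
    """Weight a (C, H, W) feature map by a spatial attention map.

    `w` is a hypothetical (C,) weight vector playing the role of a
    learned 1x1 convolution; the attention map has values in (0, 1).
    """
    logits = np.tensordot(w, feat, axes=([0], [0]))  # shape (H, W)
    att = sigmoid(logits)
    return feat * att[None, :, :], att


def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)


def hierarchical_fuse(levels, weights):
    """Fuse features ordered from deepest (coarsest) to shallowest.

    Each level is assumed to have twice the resolution of the previous
    one. The running result is upsampled and added to the gated
    features of the next level (an assumed merge rule).
    """
    fused, _ = attention_gate(levels[0], weights[0])
    for feat, w in zip(levels[1:], weights[1:]):
        gated, _ = attention_gate(feat, w)
        fused = upsample2x(fused) + gated
    return fused


def predict_saliency(fused, boundary):
    """Combine the fused feature with an (H, W) boundary map.

    Averaging over channels and adding the boundary response is an
    assumption; the result is squashed to a saliency map in (0, 1).
    """
    return sigmoid(fused.mean(axis=0) + boundary)


# Toy usage with random stand-ins for the multi-level features.
rng = np.random.default_rng(0)
levels = [rng.standard_normal((4, s, s)) for s in (2, 4, 8)]
weights = [rng.standard_normal(4) for _ in levels]
boundary = rng.standard_normal((8, 8))

fused = hierarchical_fuse(levels, weights)
saliency = predict_saliency(fused, boundary)
print(fused.shape, saliency.shape)
```

In this sketch the gate only attenuates features, which mirrors the "weight filter" role the abstract assigns to the attention module; a trained model would learn the gate weights rather than draw them at random.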
