Video-based salient object detection via spatio-temporal difference and coherence

Salient object detection aims to extract the attractive objects in images and videos. It can support various robotics tasks and multimedia applications, such as object detection, action recognition and scene analysis. However, efficient detection of salient objects in videos still faces many challenges as compared to that in still images. In this paper, we propose a novel video-based salient object detection method by exploring spatio-temporal characteristics of video content, i.e., spatial-temporal difference and spatial-temporal coherence. First, we initialize the saliency map for each keyframe by deriving spatial-temporal difference from color cue and motion cue. Next, we generate the saliency maps of other frames by propagating the saliency intra and inter frames with the constraint of spatio-temporal coherence. Finally, the saliency maps of both keyframes and non-keyframes are refined in the saliency propagation. In this way, we can detect salient objects in videos efficiently by exploring their spatio-temporal characteristics. We evaluate the proposed method on two public datasets, named SegTrackV2 and UVSD. The experimental results show that our method outperforms the state-of-the-art methods when taking account of both effectiveness and efficiency.

[1]  Pascal Fua,et al.  SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Jian Sun,et al.  Saliency Optimization from Robust Background Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Yan Liu,et al.  How important is location information in saliency detection of natural images , 2015, Multimedia Tools and Applications.

[4]  Huchuan Lu,et al.  Saliency detection via Cellular Automata , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[6]  Linwei Ye,et al.  Saliency Detection for Unconstrained Videos Using Superpixel-Level Graph and Spatiotemporal Propagation , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[7]  Jitendra Malik,et al.  Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Yan Liu,et al.  Video Saliency Detection via Dynamic Consistent Spatio-Temporal Attention Modelling , 2013, AAAI.

[9]  Changsheng Xu,et al.  Right buddy makes the difference: an early exploration of social relation analysis in multimedia applications , 2012, ACM Multimedia.

[10]  Changsheng Xu,et al.  Social event detection with robust high-order co-clustering , 2013, ICMR.

[11]  Harish Katti,et al.  Depth Matters: Influence of Depth Cues on Visual Saliency , 2012, ECCV.

[12]  Yi Yang,et al.  A discriminative CNN video representation for event detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Huimin Lu,et al.  Single image dehazing through improved atmospheric light estimation , 2015, Multimedia Tools and Applications.

[14]  K. Madhava Krishna,et al.  Depth really Matters: Improving Visual Salient Region Detection with Depth , 2013, BMVC.

[15]  Liming Zhang,et al.  A Novel Multiresolution Spatiotemporal Saliency Detection Model and Its Applications in Image and Video Compression , 2010, IEEE Transactions on Image Processing.

[16]  Fatih Murat Porikli,et al.  Saliency-aware geodesic video object segmentation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Mohan S. Kankanhalli,et al.  Hierarchical Clustering Multi-Task Learning for Joint Human Action Grouping and Recognition , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Huimin Lu,et al.  Underwater scene enhancement using weighted guided median filter , 2014, 2014 IEEE International Conference on Multimedia and Expo (ICME).

[19]  Tongwei Ren,et al.  Salient object detection for RGB-D image via saliency evolution , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).

[20]  Huimin Lu,et al.  Underwater image de-scattering and classification by deep neural network , 2016, Comput. Electr. Eng..

[21]  Changsheng Xu,et al.  Browse by chunks: Topic mining and organizing on web-scale social media , 2011, TOMCCAP.

[22]  H. Zhang,et al.  Multi-perspective and multi-modality joint representation and recognition model for 3D action recognition , 2015, Neurocomputing.

[23]  Rongrong Ji,et al.  RGBD Salient Object Detection: A Benchmark and Algorithms , 2014, ECCV.

[24]  Yan Liu,et al.  Image retargeting based on global energy optimization , 2009, 2009 IEEE International Conference on Multimedia and Expo.

[25]  Ran Ju,et al.  Saliency cuts based on adaptive triple thresholding , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[26]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Yizhou Yu,et al.  Deep Contrast Learning for Salient Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Bin Luo,et al.  Salient Object Detection via Video Spatio-Temporal Difference and Coherence , 2016, 2016 12th International Conference on Computational Intelligence and Security (CIS).

[29]  Shi-Min Hu,et al.  Global contrast based salient region detection , 2011, CVPR 2011.

[30]  Xiang Zhang,et al.  Superpixel-Based Spatiotemporal Saliency Detection , 2014, IEEE Transactions on Circuits and Systems for Video Technology.

[31]  Alexander G. Hauptmann,et al.  Which Information Sources are More Effective and Reliable in Video Search , 2016, SIGIR.

[32]  Lei Zhu,et al.  Unsupervised Visual Hashing with Semantic Assistant for Content-Based Image Retrieval , 2017, IEEE Transactions on Knowledge and Data Engineering.

[33]  Luming Zhang,et al.  A Biologically Inspired Automatic System for Media Quality Assessment , 2016, IEEE Transactions on Automation Science and Engineering.

[34]  Huchuan Lu,et al.  Saliency Detection via Graph-Based Manifold Ranking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  S. Süsstrunk,et al.  Frequency-tuned salient region detection , 2009, CVPR 2009.

[36]  Ingrid Heynderickx,et al.  Visual Attention in Objective Image Quality Assessment: Based on Eye-Tracking Data , 2011, IEEE Transactions on Circuits and Systems for Video Technology.

[37]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  James M. Rehg,et al.  Video Segmentation by Tracking Many Figure-Ground Segments , 2013, 2013 IEEE International Conference on Computer Vision.

[39]  Yang Liu,et al.  Depth-aware salient object detection using anisotropic center-surround difference , 2015, Signal Process. Image Commun..

[40]  Huchuan Lu,et al.  Saliency Detection via Absorbing Markov Chain , 2013, 2013 IEEE International Conference on Computer Vision.

[41]  Nicu Sebe,et al.  Perceptual Attributes Optimization for Multivideo Summarization , 2016, IEEE Transactions on Cybernetics.

[42]  Zhuowen Tu,et al.  Deeply Supervised Salient Object Detection with Short Connections , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Shuicheng Yan,et al.  Robust Image Analysis With Sparse Representation on Quantized Visual Features , 2013, IEEE Transactions on Image Processing.

[44]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Y. Liu,et al.  Bilinear deep learning for image classification , 2011, ACM Multimedia.

[46]  Ali Borji,et al.  Salient Object Detection: A Benchmark , 2015, IEEE Transactions on Image Processing.

[47]  Keith C. C. Chan,et al.  Tensor-based locally maximum margin classifier for image and video classification , 2011, Comput. Vis. Image Underst..

[48]  Xueqing Li,et al.  Leveraging stereopsis for saliency analysis , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Yan Liu,et al.  Unsupervised summarization of rushes videos , 2010, ACM Multimedia.

[51]  Peyman Milanfar,et al.  Static and space-time visual saliency detection by self-resemblance. , 2009, Journal of vision.

[52]  Chun-Rong Huang,et al.  Video Saliency Map Detection by Dominant Camera Motion Removal , 2014, IEEE Transactions on Circuits and Systems for Video Technology.