Video Saliency Detection via Spatial-Temporal Fusion and Low-Rank Coherency Diffusion

This paper proposes a novel video saliency detection method based on spatial-temporal saliency fusion and low-rank coherency guided saliency diffusion. Conventional methods conduct saliency detection locally, frame by frame, and can easily produce incorrect low-level saliency maps. To overcome this difficulty, we fuse the color saliency based on global motion clues in a batch-wise fashion, and we propose low-rank coherency guided spatial-temporal saliency diffusion to guarantee the temporal smoothness of the saliency maps. A series of saliency boosting strategies is also designed to further improve saliency accuracy. First, the original long-term video sequence is segmented into short-term frame batches of equal length, and the motion clues of each batch are integrated and diffused temporally to facilitate the computation of color saliency. Then, based on the obtained saliency clues, inter-batch saliency priors are modeled to guide the low-level saliency fusion. After that, both the raw color information and the fused low-level saliency are regarded as low-rank coherency clues, which guide the spatial-temporal saliency diffusion with the help of an additional permutation matrix that serves as an alternative rank selection strategy. This makes the temporal consistency of the saliency maps robust and further boosts the accuracy of the computed saliency maps. We conduct extensive experiments on five publicly available benchmarks and make comprehensive quantitative evaluations between our method and 16 state-of-the-art techniques. All the results demonstrate the superiority of our method in accuracy, reliability, robustness, and versatility.
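As a rough illustration of two ideas named in the abstract, the sketch below splits a frame sequence into short batches and then uses a truncated SVD to pull a per-batch saliency matrix toward a low-rank, temporally coherent form. It is not the authors' algorithm: the function names, the choice of truncated SVD, the fixed rank, and the handling of a shorter final batch are all assumptions made for illustration, and the permutation matrix and saliency priors described above are omitted.

```python
import numpy as np


def split_into_batches(num_frames, batch_size):
    """Group frame indices into consecutive batches of `batch_size` frames.

    Assumption: if `num_frames` is not a multiple of `batch_size`,
    the final batch is simply shorter.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(range(start, min(start + batch_size, num_frames)))
        for start in range(0, num_frames, batch_size)
    ]


def low_rank_smooth(saliency, rank=1):
    """Return the best rank-`rank` approximation of a saliency matrix.

    `saliency` has shape (num_regions, num_frames): one column per frame.
    Truncated SVD is a stand-in (hypothetical choice) for the low-rank
    coherency analysis; it suppresses frame-specific deviations.
    """
    u, s, vt = np.linalg.svd(saliency, full_matrices=False)
    r = min(rank, len(s))
    return (u[:, :r] * s[:r]) @ vt[:r, :]


# Toy data: 3 regions observed over 4 frames with identical saliency,
# plus one spurious spike in a single frame.
base = np.array([0.9, 0.1, 0.5])
clean = np.outer(base, np.ones(4))
noisy = clean.copy()
noisy[1, 2] += 0.4  # incorrect low-level saliency in frame 2
smoothed = low_rank_smooth(noisy, rank=1)
```

Here `smoothed` has rank 1, so every frame's saliency column is a scaled copy of a single shared pattern, which is one simple reading of "temporal consistency" within a batch.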
