Regularized Feature Reconstruction for Spatio-Temporal Saliency Detection

Multimedia applications such as image and video retrieval and copy detection can benefit from saliency detection, which identifies the areas in images and videos that capture the attention of the human visual system. In this paper, we propose a new spatio-temporal saliency detection framework based on regularized feature reconstruction. For video saliency detection, both temporal and spatial saliency are considered. For temporal saliency, we model the movement of the target patch as a reconstruction process using the patches in neighboring frames, and we introduce a Laplacian smoothing term to model coherent motion trajectories. Motivated by psychological findings that an abrupt stimulus can cause a rapid and involuntary deployment of attention, our temporal model combines the reconstruction error, the regularizer, and the local trajectory contrast to measure temporal saliency. For spatial saliency, a similar sparse reconstruction process is adopted to capture regions with high center-surround contrast. Finally, the temporal and spatial saliency are combined to favor salient regions with high confidence for video saliency detection. We also apply the spatial part of the spatio-temporal model to image saliency detection. Experimental results on a human fixation video dataset and an image saliency detection dataset show that our method outperforms several state-of-the-art approaches.
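The center-surround idea behind the spatial model can be illustrated with a minimal sketch: a patch that is poorly reconstructed from its surrounding patches receives a high saliency score. The abstract gives no objective function, so everything below is an assumption made for illustration only: the l1-penalized least-squares objective, the coordinate-descent solver, the function names (soft_threshold, sparse_reconstruct, reconstruction_saliency), and the parameters lam and n_iters are hypothetical, and the Laplacian smoothing term and trajectory contrast of the temporal model are not included.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_reconstruct(target, atoms, lam=0.1, n_iters=100):
    """Coordinate descent for
        min_a 0.5 * ||target - sum_j a[j] * atoms[j]||^2 + lam * ||a||_1.
    `atoms` are feature vectors of surrounding patches (assumed setup).
    Returns the coefficients and the final residual vector.
    """
    coeffs = [0.0] * len(atoms)
    residual = list(target)  # all coefficients start at zero
    for _ in range(n_iters):
        for j, atom in enumerate(atoms):
            norm_sq = sum(v * v for v in atom)
            if norm_sq == 0.0:
                continue  # an all-zero atom cannot explain anything
            # Correlation of atom j with the residual that excludes atom j.
            rho = sum(a * r for a, r in zip(atom, residual)) + coeffs[j] * norm_sq
            new_coeff = soft_threshold(rho, lam) / norm_sq
            delta = new_coeff - coeffs[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                coeffs[j] = new_coeff
    return coeffs, residual


def reconstruction_saliency(target, atoms, lam=0.1):
    """Saliency score: squared reconstruction error of the center patch."""
    _, residual = sparse_reconstruct(target, atoms, lam)
    return sum(r * r for r in residual)


if __name__ == "__main__":
    # Toy feature vectors standing in for surrounding patches.
    surround = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    similar_center = [1.0, 0.0, 0.0]   # resembles its surround
    distinct_center = [0.0, 0.0, 1.0]  # unlike anything in its surround
    print(reconstruction_saliency(similar_center, surround))
    print(reconstruction_saliency(distinct_center, surround))
```

In this toy setup the patch that resembles its surround is reconstructed almost exactly and scores low, while the patch with no counterpart in its surround keeps its full energy as residual and scores high.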
