Multi-feature based visual saliency detection in surveillance video

Video perception differs from image perception because of the motion information in video: moving objects cause differences between neighboring frames, and these differences typically attract attention. To date, most work has addressed image saliency, while video saliency has received far less attention. Based on scene understanding, this paper proposes a new multi-feature video saliency detection model. First, the background is extracted using binary tree searching; then the main foreground features are analyzed with a multi-scale perception model. The perception model integrates faces as a high-level feature that supplements low-level features such as color, intensity, and orientation. A motion saliency map is computed from statistics of the motion vector field. Finally, the per-feature conspicuity maps are merged with different weights. Compared with gaze maps from subjective experiments, the output of the proposed multi-feature video saliency model is close to the gaze map.
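The final fusion step described above can be sketched as a weighted linear combination of normalized per-feature conspicuity maps. The sketch below is a minimal illustration, not the paper's implementation: the normalization scheme, the median-based motion statistic, and the example weights are all assumptions chosen for clarity.

```python
import numpy as np

def normalize(m):
    # Scale a conspicuity map to [0, 1]; an all-constant map becomes zero.
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m, dtype=float)

def fuse_saliency(feature_maps, weights):
    # Weighted linear fusion of per-feature conspicuity maps.
    fused = np.zeros_like(feature_maps[0], dtype=float)
    for m, w in zip(feature_maps, weights):
        fused += w * normalize(m.astype(float))
    return normalize(fused)

def motion_conspicuity(mv_field):
    # Motion conspicuity from an H x W x 2 motion-vector field:
    # deviation of each vector's magnitude from the global (median) motion,
    # a simple stand-in for the paper's motion-field statistics.
    mag = np.linalg.norm(mv_field, axis=2)
    return np.abs(mag - np.median(mag))

# Toy example: static background with one moving block.
mv = np.zeros((8, 8, 2))
mv[2:4, 2:4] = [3.0, 0.0]
motion = motion_conspicuity(mv)
color = np.random.rand(8, 8)          # stand-in low-level conspicuity map
saliency = fuse_saliency([color, motion], weights=[0.3, 0.7])
print(saliency.shape)                 # (8, 8)
```

With these weights, the moving block dominates the fused map: motion contributes up to 0.7 there while the color map alone never exceeds 0.3 elsewhere, so the peak lands on the moving region.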
