A Video Saliency Detection Model in Compressed Domain

Saliency detection is widely used to extract regions of interest in images for various image processing applications. Recently, many saliency detection models have been proposed for video in the uncompressed (pixel) domain. However, video on the Internet is typically stored in compressed formats such as MPEG2, H.264, and MPEG4 Visual. In this paper, we propose a novel video saliency detection model based on feature contrast in the compressed domain. Four types of features (luminance, color, texture, and motion) are extracted from the discrete cosine transform coefficients and motion vectors in the video bitstream. The static saliency map of unpredicted frames (I frames) is calculated from the luminance, color, and texture features, while the motion saliency map of predicted frames (P and B frames) is computed from the motion feature. A new fusion method is designed to combine the static and motion saliency maps into the final saliency map for each video frame. Because its features are derived directly from the compressed domain, the proposed model can predict the salient regions of video frames efficiently. Experimental results on a public database show superior performance of the proposed video saliency detection model in the compressed domain.
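
The feature-contrast idea can be illustrated with a minimal Python sketch. It is not the paper's formulation: the Gaussian distance weighting, the `sigma` and `alpha` parameters, the function names, and the weighted-average fusion are all assumptions made for illustration, and the per-block feature vectors are assumed to have been parsed from the bitstream already (for example DC coefficients for luminance and color, AC energy for texture, or motion-vector components).

```python
import math


def normalize(grid):
    """Rescale a 2D grid of numbers to [0, 1]; a flat grid maps to all zeros."""
    lo = min(min(row) for row in grid)
    hi = max(max(row) for row in grid)
    if hi == lo:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def contrast_saliency(features, sigma=2.0):
    """Saliency of each block as its distance-weighted feature contrast.

    `features` is a 2D grid (list of rows) of equal-length feature tuples,
    one per coded block. The Gaussian weighting by block distance is an
    illustrative assumption, not the weighting defined in the paper.
    """
    rows, cols = len(features), len(features[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for p in range(rows):
                for q in range(cols):
                    if p == i and q == j:
                        continue
                    # Nearby blocks contribute more than distant ones.
                    dist2 = (i - p) ** 2 + (j - q) ** 2
                    weight = math.exp(-dist2 / (2.0 * sigma ** 2))
                    # Euclidean difference between the two feature vectors.
                    diff = math.sqrt(sum(
                        (a - b) ** 2
                        for a, b in zip(features[i][j], features[p][q])
                    ))
                    total += weight * diff
            saliency[i][j] = total
    return normalize(saliency)


def fuse(static_map, motion_map, alpha=0.5):
    """Placeholder fusion: per-block weighted average of the two maps.

    The paper designs its own fusion method; this average only stands in
    for it so the sketch runs end to end.
    """
    return [
        [alpha * s + (1.0 - alpha) * m for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(static_map, motion_map)
    ]


# A 3x3 grid where only the centre block differs from its neighbours.
static_features = [[(0.0,)] * 3, [(0.0,), (1.0,), (0.0,)], [(0.0,)] * 3]
motion_features = [[(0.0, 0.0)] * 3 for _ in range(3)]  # no motion at all

static_map = contrast_saliency(static_features)
motion_map = contrast_saliency(motion_features)
final_map = fuse(static_map, motion_map)
```

In this toy input the centre block receives the highest static saliency because it contrasts with every neighbour, while the motion map stays at zero because all motion vectors are identical.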
