Comparison of visual saliency models for compressed video

Visual saliency modeling is an increasingly important research problem. While most saliency models for dynamic scenes operate on raw video, several models have also been developed for compressed video. This paper compares the accuracy of nine such models on a common eye-tracking dataset. The results indicate that reasonably accurate saliency estimation is possible even when using only motion vectors from the compressed bitstream. Successful strategies in compressed-domain saliency modeling are highlighted, and remaining challenges are identified for future work.
