User-driven saliency maps for evaluating Region-of-Interest detection

Detecting regions of interest (ROIs) in a video enables more efficient use of bandwidth, because the ROIs in a given frame can be encoded at higher quality than the rest of that frame, with little or no loss of quality as perceived by viewers. Consequently, the whole video need not be encoded uniformly at high quality. One approach to determining ROIs is to use saliency detectors to locate salient regions. This paper proposes a methodology for obtaining ground-truth saliency maps with which to measure the effectiveness of ROI detection, taking into account the role of user experience during the labelling of such maps. User perceptions can be captured and incorporated into the definition of salience in a particular video, taking advantage of human visual recall within a given context. Experiments with two state-of-the-art saliency detectors support the effectiveness of this approach to evaluating visual saliency in video. This paper also provides the relevant datasets associated with the experiments.
