Touch Saliency: Characteristics and Prediction

In this work, we propose an alternative ground truth to the eye-fixation map for visual attention study, called touch saliency. Because it can be collected directly from the recorded traces of users' everyday browsing on widely used touch-screen smartphones, touch saliency data are easy to obtain. Owing to the limited screen size, smartphone users typically pan and zoom an image so that the region of interest fills the screen while browsing. Our study is two-fold. First, we collect these touch-screen fixation maps (the touch saliency) and analyze their characteristics through comprehensive comparisons with their counterpart, eye-fixation maps (i.e., visual saliency). The comparisons show that touch saliency is highly correlated with eye fixations on the same stimuli, which indicates its utility as an inexpensive data source for visual attention study. Second, building on this consistency between touch saliency and visual saliency, we propose a unified prediction model for both. The model feeds middle-level object-category features, extracted from pre-segmented image superpixels, into the recently proposed multitask sparsity pursuit (MTSP) framework for saliency prediction. Extensive evaluations show that the proposed middle-level category features considerably improve saliency prediction performance with either touch saliency or visual saliency as ground truth.
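The comparison between touch and eye fixations described above follows the standard recipe for fixation data: accumulate the discrete fixation points into a density map, smooth it with a Gaussian, and score agreement between two maps with a linear correlation coefficient. A minimal sketch of that recipe (the function names, smoothing bandwidth, and choice of correlation metric are illustrative assumptions, not details taken from the paper):

```python
import numpy as np


def fixation_density_map(points, shape, sigma=25.0):
    """Turn discrete (x, y) fixation points (touch or eye) into a
    continuous saliency map: place an impulse at each fixation and
    blur with an isotropic Gaussian. `sigma` (pixels) is an
    illustrative bandwidth, not the paper's setting."""
    h, w = shape
    acc = np.zeros(shape, dtype=np.float64)
    for x, y in points:
        if 0 <= int(y) < h and 0 <= int(x) < w:
            acc[int(y), int(x)] += 1.0
    # Separable Gaussian blur: convolve each column, then each row.
    r = int(3 * sigma)
    kernel = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    kernel /= kernel.sum()
    blurred = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), 0, acc)
    blurred = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), 1, blurred)
    if blurred.max() > 0:
        blurred /= blurred.max()
    return blurred


def map_correlation(a, b):
    """Pearson linear correlation (CC) between two saliency maps,
    a common agreement score for fixation maps."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())
```

Under this scheme, a touch-saliency map built from panning/zooming traces and an eye-fixation map for the same image can be compared directly: nearby fixation clusters yield correlation near 1, while disjoint clusters yield correlation near 0.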
