Tag-Saliency: Combining bottom-up and top-down information for saliency detection

In the real world, people tend to pay more attention to things that are usually noteworthy while ignoring others. This phenomenon is associated with top-down attention. Modeling this kind of attention has recently attracted considerable interest in computer vision because of its wide range of practical applications. The majority of existing models are based on eye tracking or object detection. However, these methods may not apply in practical situations, because eye-movement data cannot always be recorded, and large-scale data sets may contain inscrutable objects that must be handled. This paper proposes a Tag-Saliency model based on hierarchical image over-segmentation and auto-tagging, which can efficiently extract semantic information from large-scale visual media data. Experimental results on a very challenging data set show that the proposed Tag-Saliency model locates the truly salient regions with higher probability than competing methods.
