Visual Attention is Beyond One Single Saliency Map

In recent years, numerous bottom-up attention models have been proposed based on different assumptions. However, the saliency maps they produce may differ from one another even for the same input image. We also observe that the human fixation map varies greatly over time: when people freely view an image, they first allocate attention to large-scale salient regions and then search progressively more detailed regions. In this paper, we argue that, for a single input image, visual attention cannot be described by one saliency map alone, and that this mechanism should instead be modeled as a dynamic process. Under the frequency-domain paradigm, we propose a global inhibition model that mimics this process by suppressing the "non-saliency" in the input image; we also show that the dynamic process is governed by a single parameter in the frequency domain. Experiments illustrate that the proposed model is capable of predicting the dynamic distribution of human fixations.
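The abstract does not spell out the computation behind the global inhibition model, so the following is a minimal illustrative sketch rather than the authors' implementation. It borrows a spectral-residual-style frequency-domain pipeline in which a single Gaussian width, sigma, applied to the log-amplitude spectrum, stands in for the frequency-domain parameter said to control the dynamic process. The function name, the choice of Gaussian smoothing, and the post-processing constants are all assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_domain_saliency(image, sigma):
    """Return a normalized saliency map for a 2-D grayscale image.

    `sigma` is the width of the Gaussian used to smooth the log-amplitude
    spectrum; in this sketch it plays the role of the single frequency-domain
    parameter discussed in the abstract (an assumption, not the authors'
    exact formulation).
    """
    f = np.fft.fft2(image.astype(float))
    log_amp = np.log1p(np.abs(f))
    phase = np.angle(f)

    # "Global inhibition" (illustrative): the smoothed spectrum approximates
    # the non-saliency (regular, redundant structure) and is subtracted,
    # spectral-residual style, so only the irregular part survives.
    residual = log_amp - gaussian_filter(log_amp, sigma)

    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = gaussian_filter(saliency, 3)  # mild post-smoothing for display
    return (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-12)

# Sweeping sigma yields a sequence of maps rather than a single one: each
# setting changes which spatial scales survive the inhibition, which is one
# way to picture the dynamic, multi-map view argued for above.
maps = [frequency_domain_saliency(np.random.rand(256, 256), s) for s in (16, 8, 4, 2)]
```

Which direction of the sigma sweep corresponds to coarse-to-fine viewing is left open here; the point of the sketch is only that a single frequency-domain parameter can index a family of saliency maps instead of one fixed map.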
