Medium Spatial Frequencies, a Strong Predictor of Salience

The extent to which so-called low-level features predict gaze allocation has been widely studied in recent years, yet the conclusions are contradictory. Edges and luminance contrast appear to be consistently involved, but the literature disagrees about the contribution of the different spatial scales. Experiments using man-made scenes tend to conclude that fixation locations can be efficiently discriminated using high-frequency information, whereas medium or low frequencies are more discriminative for natural scenes. This paper focuses on the importance of spatial scale in predicting visual attention. We propose a fast attentional model and study which frequency band best predicts fixation locations during a free-viewing task. An eye-tracking experiment was conducted using different scene categories defined by their Fourier spectra (Coast, OpenCountry, Mountain, and Street). We found that medium frequencies (0.7–1.3 cycles per degree) gave the best overall prediction of attention, with variability among categories. Fixation locations were more predictable using medium to high frequencies in man-made street scenes and low to medium frequencies in natural landscape scenes.
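To make the notion of a frequency band in cycles per degree concrete, the following is a minimal sketch of isolating one band of an image with an ideal band-pass mask in the Fourier domain. It is not the attentional model proposed in the paper, whose details are not given here. The function name `bandpass_energy`, the hard-edged mask, and the `pixels_per_degree` value (which depends on screen resolution and viewing distance) are assumptions made for illustration only.

```python
import numpy as np


def bandpass_energy(image, low_cpd, high_cpd, pixels_per_degree):
    """Squared response of `image` restricted to [low_cpd, high_cpd] cycles/degree.

    Hypothetical helper: uses an ideal (hard-edged) radial mask, which is an
    illustrative choice and not the filter bank of any particular model.
    """
    h, w = image.shape
    # np.fft.fftfreq returns cycles per pixel; scaling by pixels per degree
    # converts each frequency bin to cycles per degree of visual angle.
    fy = np.fft.fftfreq(h) * pixels_per_degree
    fx = np.fft.fftfreq(w) * pixels_per_degree
    radius = np.hypot(fy[:, None], fx[None, :])
    mask = (radius >= low_cpd) & (radius <= high_cpd)
    # The mask is radially symmetric, so the filtered image is real up to
    # floating-point error; keep the real part only.
    filtered = np.fft.ifft2(np.fft.fft2(image) * mask).real
    return filtered ** 2


# Usage: a vertical grating at 1 cycle/degree, assuming 32 pixels per degree.
x = np.arange(128)
grating = np.tile(np.sin(2 * np.pi * x / 32), (128, 1))
medium_band = bandpass_energy(grating, 0.7, 1.3, pixels_per_degree=32)
print(medium_band.mean())
```

A map like `medium_band` could then be compared against recorded fixation locations for each band in turn, which is the kind of per-band comparison the abstract describes. A smooth filter (for example, a difference of Gaussians) would reduce the ringing that a hard-edged mask introduces.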
