A Functional and Statistical Bottom-Up Saliency Model to Reveal the Relative Contributions of Low-Level Visual Guiding Factors

When looking at a scene, we frequently move our eyes to bring successive regions of interest onto the fovea, the centre of the retina. At each fixation, only this foveal region is analysed in detail by the visual system. Visual attention mechanisms control eye movements and depend on two types of factors: bottom-up and top-down. Bottom-up factors include visual features such as colour, luminance, edges, and orientations. In this paper, we quantitatively evaluate the relative contributions of basic low-level features as candidate guiding factors for visual attention, and hence for eye movements. We also study how these visual features can be combined in a bottom-up saliency model. Our work consists of three interacting parts: a functional saliency model, a statistical model, and eye movement data recorded during free viewing of natural scenes. The functional saliency model, inspired by the primate visual system, decomposes a visual scene into different feature maps. The statistical model indicates which features best explain the recorded eye movements. We show an essential role of high-frequency luminance and an important contribution of the central fixation bias. The relative contributions of the features, calculated by the statistical model, are then used to combine the different feature maps into a saliency map. Finally, a comparison between the saliency model and the experimental data confirms the influence of these contributions.
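The idea of weighting feature maps by their estimated contributions can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: each feature map is treated as a probability density over pixels, the central fixation bias is modelled as a 2D Gaussian, and the relative contributions are estimated as the mixing weights of a mixture of these fixed densities, fitted to fixation positions by expectation-maximisation. The function names (`to_density`, `central_bias`, `estimate_weights`, `combine`) and the `sigma_frac` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def to_density(feature_map):
    """Shift a map to non-negative values and normalise it to sum to 1."""
    m = np.asarray(feature_map, dtype=float)
    m = m - m.min()
    total = m.sum()
    if total == 0:
        # A constant map carries no spatial preference: use a uniform density.
        return np.full(m.shape, 1.0 / m.size)
    return m / total


def central_bias(shape, sigma_frac=0.25):
    """Gaussian density centred on the image (assumed form of the centre bias)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    g = np.exp(-0.5 * (((ys - cy) / (sigma_frac * h)) ** 2
                       + ((xs - cx) / (sigma_frac * w)) ** 2))
    return g / g.sum()


def estimate_weights(densities, fixations, n_iter=200):
    """EM for the mixing weights of fixed component densities.

    densities: list of 2D arrays, each summing to 1.
    fixations: integer array of shape (n_fix, 2) holding (row, col) positions.
    """
    fixations = np.asarray(fixations, dtype=int)
    rows, cols = fixations[:, 0], fixations[:, 1]
    # Likelihood of every fixation under every component: (n_fix, n_comp).
    lik = np.stack([d[rows, cols] for d in densities], axis=1) + 1e-12
    weights = np.full(len(densities), 1.0 / len(densities))
    for _ in range(n_iter):
        # E-step: responsibility of each component for each fixation.
        resp = weights * lik
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: new weights are the mean responsibilities.
        weights = resp.mean(axis=0)
    return weights


def combine(densities, weights):
    """Weighted sum of the component densities, used here as the saliency map."""
    return sum(w * d for w, d in zip(weights, densities))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (32, 32)
    # Stand-ins for real feature maps (random data, for illustration only).
    luminance_hf = to_density(rng.random(shape))
    colour = to_density(rng.random(shape))
    centre = central_bias(shape)
    # Synthetic fixations drawn from the centre-bias density.
    idx = rng.choice(shape[0] * shape[1], size=200, p=centre.ravel())
    fixations = np.column_stack(np.unravel_index(idx, shape))
    comps = [luminance_hf, colour, centre]
    w = estimate_weights(comps, fixations)
    saliency = combine(comps, w)
    print("estimated weights:", w)
```

Because the component densities are held fixed, only the mixing weights are updated, which keeps each EM step to a single normalisation and an average. The resulting weights are non-negative and sum to one, so they can be read directly as relative contributions under the stated assumptions.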
