Disentangling top-down vs. bottom-up and low-level vs. high-level influences on eye movements over time

Bottom-up and top-down factors, as well as low-level and high-level factors, influence where we fixate when viewing natural scenes. However, the importance of each of these factors, and how they interact, remains a matter of debate. Here, we disentangle these factors by analysing their influence over time. For this purpose, we develop a saliency model based on the internal representation of a recent early spatial vision model to measure the low-level bottom-up factor. To measure the influence of high-level bottom-up features, we use a recent DNN-based saliency model. To account for top-down influences, we evaluate the models on two large datasets with different tasks: first, a memorisation task and, second, a search task. Our results lend support to a separation of visual scene exploration into three phases: the first saccade, an initial guided exploration characterised by a gradual broadening of the fixation density, and a steady state reached after roughly 10 fixations. Saccade target selection during the initial exploration and in the steady state is related to similar areas of interest, which are better predicted when high-level features are included. In the search dataset, fixation locations are determined predominantly by top-down processes. In contrast, the first fixation follows a different fixation density and shows a strong central fixation bias. Nonetheless, first fixations are strongly guided by image properties, and as early as 200 ms after image onset, fixations are better predicted by high-level information. We conclude that any influence of low-level bottom-up factors is largely limited to the generation of the first saccade. All saccades are better explained when high-level features are considered, and later this high-level bottom-up control can be overruled by top-down influences.
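To make the time-resolved model comparison concrete, below is a minimal sketch (not the authors' code) of how a saliency model can be scored separately for each fixation index, so that its predictive value for the first saccade, the initial exploration, and the steady state can be compared. It assumes fixations are available as (x, y, index) records in pixel coordinates and that each model provides a normalised fixation density over the image; the function name and data layout are hypothetical.

```python
import numpy as np

def loglik_by_fixation_index(fixations, density, max_index=15):
    """Average log-likelihood (bits per fixation) of a saliency model,
    computed separately for each fixation index (1 = first fixation).

    fixations : iterable of (x, y, index) tuples in pixel coordinates
    density   : 2D array holding a normalised fixation density (sums to 1)
    """
    # Clip the density away from zero so log2 is always defined.
    log_density = np.log2(np.maximum(density, 1e-20))
    scores = {}
    for idx in range(1, max_index + 1):
        points = [(x, y) for (x, y, i) in fixations if i == idx]
        if not points:
            continue  # no fixations recorded at this index
        # Mean log-density at the fixated pixels for this index.
        scores[idx] = np.mean([log_density[int(y), int(x)] for (x, y) in points])
    return scores
```

Evaluating a low-level model, a DNN-based model, and a task-specific baseline with such per-index scores is what allows the phases described above to be separated; in practice these scores are usually reported as information gain relative to a baseline density rather than as raw log-likelihoods.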
