Disentangling bottom-up versus top-down and low-level versus high-level influences on eye movements over time.

Bottom-up and top-down, as well as low-level and high-level, factors influence where we fixate when viewing natural scenes. However, the importance of each of these factors and how they interact remain a matter of debate. Here, we disentangle these factors by analyzing their influence over time. For this purpose, we develop a saliency model based on the internal representation of a recent model of early spatial vision to measure the low-level, bottom-up factor. To measure the influence of high-level, bottom-up features, we use a recent saliency model based on a deep neural network. To account for top-down influences, we evaluate the models on two large data sets with different tasks: first, a memorization task and, second, a search task. Our results lend support to a separation of visual scene exploration into three phases: the first saccade; an initial guided exploration, characterized by a gradual broadening of the fixation density; and a steady state, which is reached after roughly 10 fixations. Saccade-target selection during the initial exploration and in the steady state is related to similar areas of interest, which are better predicted when high-level features are included. In the search data set, fixation locations are determined predominantly by top-down processes. In contrast, the first fixation follows a different fixation density and exhibits a strong central fixation bias. Nonetheless, first fixations are strongly guided by image properties, and as early as 200 ms after image onset, fixations are better predicted by high-level information. We conclude that low-level, bottom-up factors are mainly limited to the generation of the first saccade. All saccades are better explained when high-level features are considered, and later this high-level, bottom-up control can be overruled by top-down influences.
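
The time-resolved model comparison described above can be made concrete with a likelihood-based evaluation: score each observed fixation by the log-probability that a model's predicted fixation density assigns to its location, then average these scores separately for each fixation index (first fixation after image onset, second, and so on). The following Python sketch illustrates this idea; the data structures (density_maps, fixations) and function names are assumptions for illustration only, not the authors' implementation.

    import numpy as np

    def avg_log_likelihood_by_index(density_maps, fixations, eps=1e-12):
        """Mean log-likelihood (bits per fixation) of observed fixations
        under a model's predicted fixation density, grouped by fixation
        index (1 = first fixation after image onset, 2 = second, ...).

        density_maps: dict image_id -> 2D numpy array normalized to sum to 1
        fixations:    iterable of (image_id, fix_index, row, col) tuples
        (Both are hypothetical structures assumed for this sketch.)
        """
        by_index = {}
        for image_id, fix_index, row, col in fixations:
            p = density_maps[image_id][row, col]
            by_index.setdefault(fix_index, []).append(np.log2(p + eps))
        return {i: float(np.mean(lls)) for i, lls in sorted(by_index.items())}

    def information_gain_by_index(model_ll, baseline_ll):
        """Information gain of a model over a baseline density (e.g., a
        central fixation bias), computed separately per fixation index."""
        return {i: model_ll[i] - baseline_ll[i] for i in model_ll}

Comparing such curves for a low-level saliency model and a high-level, deep neural network-based model, each against an image-independent baseline, is one way to quantify when during a trial each class of features predicts fixations best.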
