Meaning Guides Attention during Real-World Scene Description

Intelligent analysis of a visual scene requires that important regions be prioritized and attentionally selected for preferential processing. What is the basis for this selection? Here we compared the influence of meaning and image salience on attentional guidance in real-world scenes during two free-viewing scene description tasks. Meaning was represented by meaning maps capturing the spatial distribution of semantic features. Image salience was represented by saliency maps capturing the spatial distribution of image features. Both types of maps were coded in a format that could be directly compared to maps of the spatial distribution of attention derived from viewers’ eye fixations in the scene description tasks. The results showed that both meaning and salience predicted the spatial distribution of attention in these tasks, but that when the correlation between meaning and salience was statistically controlled, only meaning accounted for unique variance in attention. The results support theories in which cognitive relevance plays the dominant functional role in controlling human attentional guidance in scenes. The results also have practical implications for current artificial intelligence approaches to labeling real-world images.
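The core analysis described above — testing whether salience still predicts attention once its shared variance with meaning is controlled — amounts to comparing a simple correlation between spatial maps with a partial correlation. The sketch below is a minimal illustration of that logic on synthetic maps, not the authors' actual pipeline; the map sizes, mixing weights, and use of NumPy are assumptions for the example.

```python
import numpy as np

def map_correlation(a, b):
    """Pearson correlation between two flattened spatial maps."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def partial_correlation(x, y, z):
    """Correlation between x and y with z statistically controlled
    (partialled out of both), from the standard pairwise-r formula."""
    rxy = map_correlation(x, y)
    rxz = map_correlation(x, z)
    ryz = map_correlation(y, z)
    return (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))

# Synthetic 10x10 maps: attention is driven by meaning alone, while
# salience is correlated with meaning (as in real scenes) but adds
# no independent signal of its own.
rng = np.random.default_rng(0)
meaning = rng.random((10, 10))
salience = 0.7 * meaning + 0.3 * rng.random((10, 10))
attention = 0.8 * meaning + 0.2 * rng.random((10, 10))

r_simple = map_correlation(salience, attention)
r_partial = partial_correlation(salience, attention, meaning)
```

Here `r_simple` is substantial only because salience and meaning overlap; once meaning is partialled out, `r_partial` drops toward zero, mirroring the pattern the abstract reports for salience.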
