The Interplay of Bottom-Up and Top-Down Mechanisms in Visual Guidance during Object Naming

An ongoing issue in visual cognition concerns the roles played by low- and high-level information in guiding visual attention, with current research remaining inconclusive about the interaction between the two. In this study, we bring fresh evidence to this long-standing debate by investigating visual saliency and contextual congruency during object naming (Experiment 1), a task in which visual processing interacts with language processing. We then compare the results of this experiment with data from a memorization task using the same stimuli (Experiment 2). In Experiment 1, we find that both saliency and congruency influence visual and naming responses and interact with linguistic factors. In particular, incongruent objects are fixated later and less often than congruent ones. However, saliency is a significant predictor of object naming, with salient objects being named earlier in a trial. Furthermore, the saliency and congruency of a named object interact with the lexical frequency of the associated word and mediate the time course of fixations at naming. In Experiment 2, we find a similar overall pattern in the eye-movement responses, but only the congruency of the target is a significant predictor, with incongruent targets fixated less often than congruent targets. Crucially, this finding contrasts with claims in the literature that incongruent objects are more informative than congruent objects because they deviate from scene context and hence need longer processing. Overall, this study suggests that different sources of information are used interactively to guide visual attention to the targets to be named, and it raises new questions for existing theories of visual attention.
