Interesting objects are visually salient.

How do we decide which objects in a visual scene are more interesting? While intuition may point toward high-level object recognition and cognitive processes, here we investigate the contribution of a much simpler process: low-level visual saliency. We used the LabelMe database (24,863 photographs with 74,454 manually outlined objects) to evaluate how often interesting objects were among the few most salient locations predicted by a computational model of bottom-up attention. In 43% of all images, the model's most salient predicted location fell within a labeled region (chance: 21%). Furthermore, in 76% of the images (chance: 43%), at least one of the top three salient locations fell on an outlined object, with performance leveling off after six predicted locations. The bottom-up attention model has no notion of objects and no notion of semantic relevance. Hence, our results indicate that selecting interesting objects in a scene is largely constrained by low-level visual properties rather than solely determined by higher cognitive processes.
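The evaluation the abstract describes can be sketched as follows: compute a bottom-up saliency map, take the top k locations (suppressing a neighborhood around each selected peak, a crude stand-in for the model's inhibition of return), and check whether each location falls inside any manually outlined object. This is a minimal illustration, not the authors' implementation; the function name, the `ior_radius` parameter, and the use of boolean masks in place of LabelMe polygon outlines are all assumptions for the sake of the example.

```python
import numpy as np

def top_salient_hits(saliency, object_masks, k=3, ior_radius=2):
    """Return a list of k booleans: does the i-th most salient location
    fall on any labeled object?  `saliency` is a 2-D array; `object_masks`
    is a list of boolean arrays of the same shape, one per outlined object
    (stand-ins for LabelMe polygons).  After each pick, a small square
    neighborhood around the peak is suppressed so the next pick lands
    elsewhere (crude inhibition of return)."""
    s = np.array(saliency, dtype=float)
    hits = []
    for _ in range(k):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        hits.append(any(m[y, x] for m in object_masks))
        # suppress the chosen peak's neighborhood before the next pick
        s[max(0, y - ior_radius):y + ior_radius + 1,
          max(0, x - ior_radius):x + ior_radius + 1] = -np.inf
    return hits

# toy example: one bright point inside a single labeled object region
sal = np.zeros((10, 10))
sal[4, 5] = 1.0                 # most salient point
mask = np.zeros((10, 10), dtype=bool)
mask[3:6, 4:7] = True           # labeled object covering that point
print(top_salient_hits(sal, [mask], k=1))  # [True]
```

The paper's headline numbers correspond to the fraction of images for which `hits[0]` is true (43% vs. 21% chance) and for which `any(hits)` with k=3 is true (76% vs. 43% chance), aggregated over the database.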
