A Unified Theory of Attentional Control

Although diverse, theories of visual attention generally share the notion that attention is controlled by some combination of three distinct strategies: (1) exogenous cueing from locally contrasting primitive visual features, such as abrupt onsets or color singletons (e.g., Itti et al., 1998); (2) endogenous gain modulation of exogenous activations, used to guide attention to task-relevant features (e.g., Navalpakkam and Itti, 2005; Wolfe, 1994, 2007); and (3) endogenous prediction of likely locations of interest, based on task and scene gist (e.g., Torralba, Oliva, Castelhano, and Henderson, 2006). We propose a unifying conceptualization in which attention is controlled along two dimensions: the degree of task focus and the spatial scale of operation. Previously proposed strategies, and their combinations, can be viewed as instances of this mechanism. Thus, this theory serves not as a replacement for existing models, but as a means of bringing them into a coherent framework. We implement this theory and demonstrate its applicability to a wide range of attentional phenomena. The model reproduces the trends found in visual search tasks with synthetic images and makes predictions that correspond well with human eye movement data for tasks involving real-world images. In addition, the theory yields an unusual perspective on attention that places a fundamental emphasis on the role of experience and task-related knowledge.

[1]  J. Wolfe Visual memory: What do you know about what you saw? , 1998, Current Biology.

[2]  Dominique Lamy,et al.  Effects of search mode and intertrial priming on singleton search , 2006, Perception & psychophysics.

[3]  Laurent Itti,et al.  Rapid Biologically-Inspired Scene Classification Using Features Shared with Visual Attention , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  M. Chun,et al.  Perceptual constraints on implicit learning of spatial context , 2002 .

[5]  Matthew H Tong,et al.  SUN: Top-down saliency using natural statistics , 2009, Visual cognition.

[6]  M. Posner,et al.  Components of visual orienting , 1984 .

[7]  Tim K Marks,et al.  SUN: A Bayesian framework for saliency using natural statistics. , 2008, Journal of vision.

[8]  L. Zhaoping,et al.  A theory of a saliency map in primary visual cortex (V1) tested by psychophysics of colour–orientation interference in texture segmentation , 2006 .

[9]  Antonio Torralba,et al.  Contextual Modulation of Target Saliency , 2001, NIPS.

[10]  I. Biederman Perceiving Real-World Scenes , 1972, Science.

[11]  Tomaso Poggio,et al.  Intracellular measurements of spatial integration and the MAX operation in complex cells of the cat primary visual cortex. , 2004, Journal of neurophysiology.

[12]  Dietmar Heinke,et al.  SAIM: A Model of Visual Attention and Neglect , 1997, ICANN.

[13]  Antonio Torralba,et al.  Modeling global scene factors in attention. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[14]  Christopher D. Wickens,et al.  Attention to Attention and Its Applications: A Concluding View , 2006 .

[15]  Michael C. Mozer,et al.  Experience-Guided Search: A Theory of Attentional Control , 2007, NIPS.

[16]  G. Woodman,et al.  Lower region: a new cue for figure-ground assignment. , 2002, Journal of experimental psychology. General.

[17]  L. Itti,et al.  Modeling the influence of task on attention , 2005, Vision Research.

[18]  Peter Dayan,et al.  Inference, Attention, and Decision in a Bayesian Neural Architecture , 2004, NIPS.

[19]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[20]  H. Egeth,et al.  Overriding stimulus-driven attentional capture , 1994, Perception & psychophysics.

[21]  James R. Brockmole,et al.  Contextual cueing in naturalistic scenes: Global and local contexts. , 2006, Journal of experimental psychology. Learning, memory, and cognition.

[22]  A. Treisman Perceptual grouping and attention in visual search for features and for objects. , 1982, Journal of experimental psychology. Human perception and performance.

[23]  William T. Freeman,et al.  Presented at: 2nd Annual IEEE International Conference on Image , 1995 .

[24]  Michael C. Mozer,et al.  Perception of multiple objects - a connectionist approach , 1991, Neural network modeling and connectionism.

[25]  John M Henderson,et al.  The Role of Meaning in Contextual Cueing: Evidence from Chess Expertise , 2008, Quarterly journal of experimental psychology.

[26]  J. Wolfe,et al.  Guided Search 2.0 A revised model of visual search , 1994, Psychonomic bulletin & review.

[27]  C. Koch,et al.  A saliency-based search mechanism for overt and covert shifts of visual attention , 2000, Vision Research.

[28]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[29]  Rajesh P. N. Rao,et al.  Eye movements in iconic visual search , 2002, Vision Research.

[30]  Christof Koch,et al.  Predicting human gaze using low-level saliency combined with face detection , 2007, NIPS.

[31]  M. Behrmann,et al.  Spatial probability as an attentional cue in visual search , 2005, Perception & psychophysics.

[32]  M. Chun,et al.  Contextual Cueing: Implicit Learning and Memory of Visual Context Guides Spatial Attention , 1998, Cognitive Psychology.

[33]  T. Gawne,et al.  Responses of primate visual cortical V4 neurons to simultaneously presented stimuli. , 2002, Journal of neurophysiology.

[34]  Jeremy M. Wolfe,et al.  Guided Search 4.0: Current Progress With a Model of Visual Search , 2007, Integrated Models of Cognitive Systems.

[35]  G. Humphreys,et al.  Attention, spatial representation, and visual neglect: simulating emergent attention and spatial memory in the selective attention for identification model (SAIM). , 2003, Psychological review.

[36]  Mary A. Peterson,et al.  Memory and Learning in Figure-Ground Perception , 2003 .

[37]  Pierre Baldi,et al.  Bayesian surprise attracts human attention , 2005, Vision Research.

[38]  T. Poggio,et al.  Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.

[39]  J. C. Johnston,et al.  Involuntary covert orienting is contingent on attentional control settings. , 1992, Journal of experimental psychology. Human perception and performance.

[40]  Michael C. Mozer,et al.  Top-Down Control of Visual Attention: A Rational Account , 2005, NIPS.

[41]  Christof Koch,et al.  Using semantic content as cues for better scanpath prediction , 2008, ETRA.

[42]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 1998 .

[43]  Gregory J. Zelinsky,et al.  Scene context guides eye movements during visual search , 2006, Vision Research.

[44]  S Ullman,et al.  Shifts in selective visual attention: towards the underlying neural circuitry. , 1985, Human neurobiology.

[45]  J. Wolfe,et al.  What attributes guide the deployment of visual attention and how do they do it? , 2004, Nature Reviews Neuroscience.

[46]  Nuno Vasconcelos,et al.  On the plausibility of the discriminant center-surround hypothesis for visual saliency. , 2008, Journal of vision.

[47]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[48]  Antonio Torralba,et al.  Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. , 2006, Psychological review.

[49]  Susan L. Franzel,et al.  Guided search: an alternative to the feature integration model for visual search. , 1989, Journal of experimental psychology. Human perception and performance.

[50]  John K. Tsotsos,et al.  Saliency, attention, and visual search: an information theoretic approach. , 2009, Journal of vision.

[51]  E. Averbach,et al.  Short-term memory in vision , 1961 .