Modeling global scene factors in attention.

Models of visual attention have focused predominantly on bottom-up approaches that ignore structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. I show that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition.
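
The idea of modulating saliency with global context can be illustrated with a small sketch. It is not the model proposed here, only a toy instance under stated assumptions: gradient energy stands in for both the low-level features and the bottom-up saliency map, the global descriptor is a coarse grid average of that energy, and a linear predictor with hypothetical `weights` and `bias` stands in for whatever learned mapping takes scene statistics to an expected target row. All function names are illustrative.

```python
import numpy as np

def global_features(image, grid=4):
    """Coarse global descriptor: mean gradient energy in each cell of a grid x grid layout."""
    gy, gx = np.gradient(image.astype(float))
    energy = np.hypot(gx, gy)
    h, w = energy.shape
    # Crop so the image divides evenly into cells, then average within each cell.
    ch, cw = h // grid, w // grid
    cells = energy[:ch * grid, :cw * grid].reshape(grid, ch, grid, cw)
    return cells.mean(axis=(1, 3)).ravel()

def local_saliency(image):
    """Bottom-up saliency stand-in (assumption): gradient energy normalized to sum to 1."""
    gy, gx = np.gradient(image.astype(float))
    s = np.hypot(gx, gy) + 1e-12
    return s / s.sum()

def location_prior(shape, expected_row, spread):
    """Hypothetical contextual prior: a Gaussian band centered on the expected image row."""
    rows = np.arange(shape[0], dtype=float)
    band = np.exp(-0.5 * ((rows - expected_row) / spread) ** 2)
    prior = np.repeat(band[:, None], shape[1], axis=1)
    return prior / prior.sum()

def modulated_saliency(image, weights, bias, spread=8.0):
    """Multiply bottom-up saliency by a location prior predicted from global features."""
    g = global_features(image)
    # Linear predictor of the target row (an assumption made for illustration).
    expected_row = float(np.clip(g @ weights + bias, 0, image.shape[0] - 1))
    combined = local_saliency(image) * location_prior(image.shape, expected_row, spread)
    return combined / combined.sum()
```

Because the prior is computed from whole-image statistics before any region is inspected, it can suppress locally salient regions that lie where the target is unlikely to appear, which is the kind of shortcut the abstract describes.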
