Attention in hierarchical models of object recognition.

Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been proposed to explain these two processes and their interactions, and some of these models appear to contradict one another. We suggest a unifying framework for object recognition and attention and review the existing modeling literature in this context. We also demonstrate a proof-of-concept implementation in which complex features are shared between recognition and attention, as a mode of top-down attention to particular objects or object categories.
