A Bayesian inference theory of attention: neuroscience and algorithms

Abstract

The past four decades of research in visual neuroscience have generated a large and disparate body of literature on the role of attention [Itti et al., 2005]. Although several models have been developed to describe specific properties of attention, a theoretical framework that explains the computational role of attention and is consistent with all known effects is still needed. Recently, several authors have suggested that visual perception can be interpreted as a Bayesian inference process [Rao et al., 2002, Knill and Richards, 1996, Lee and Mumford, 2003]. Within this framework, top-down priors conveyed by cortical feedback help disambiguate noisy bottom-up sensory input signals. Building on earlier work by Rao [2005], we show that this Bayesian inference proposal can be extended to explain the role and predict the main properties of attention: namely, to facilitate the recognition of objects in clutter. Visual recognition proceeds by estimating the posterior probabilities for objects and their locations within an image via an exchange of messages between ventral and parietal areas of the visual cortex. Within this framework, spatial attention is used to reduce the uncertainty in feature information; feature-based attention is used to reduce the uncertainty in location information. In conjunction, they are used to recognize objects in clutter. Here, we find that several key attentional phenomena, such as pop-out, multiplicative modulation, and change in contrast response, emerge naturally as properties of the network. We explain the idea in three stages. First, we develop a simplified model of attention in the brain, identifying the primary areas involved and their interconnections. Second, we propose a Bayesian network in which each node has direct neural correlates within our simplified biological model. Finally, we elucidate the properties of the resulting model, showing that its predictions are consistent with physiological and behavioral evidence.
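The central claim of the abstract, that a sharper prior over location reduces uncertainty about object identity and a sharper prior over identity reduces uncertainty about location, can be illustrated with a very small discrete example. The sketch below is not the network proposed in the paper: it collapses the feature layers into a single likelihood table, and the table `LIK`, the prior values, and the function names `entropy` and `posteriors` are all hypothetical choices made only for illustration.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def posteriors(lik, prior_obj, prior_loc):
    """Return (P(O | I), P(L | I)) given lik[o][l] = P(I | O=o, L=l).

    Assumes object identity O and location L are independent a priori,
    so the joint posterior is proportional to lik * P(O) * P(L).
    """
    n_obj, n_loc = len(prior_obj), len(prior_loc)
    joint = [[lik[o][l] * prior_obj[o] * prior_loc[l] for l in range(n_loc)]
             for o in range(n_obj)]
    z = sum(sum(row) for row in joint)  # normalizing constant P(I)
    post_obj = [sum(row) / z for row in joint]
    post_loc = [sum(joint[o][l] for o in range(n_obj)) / z for l in range(n_loc)]
    return post_obj, post_loc

# Hypothetical cluttered image: evidence supports object 0 at location 0
# and object 1 at location 2 equally well.
LIK = [
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
]
UNIFORM_OBJ = [0.5, 0.5]
UNIFORM_LOC = [1 / 3, 1 / 3, 1 / 3]

# No attention: flat priors over both identity and location.
base_obj, base_loc = posteriors(LIK, UNIFORM_OBJ, UNIFORM_LOC)

# Spatial attention, modeled here as a prior concentrated on location 0.
spat_obj, _ = posteriors(LIK, UNIFORM_OBJ, [0.8, 0.1, 0.1])

# Feature-based attention, modeled here as a prior concentrated on object 0.
_, feat_loc = posteriors(LIK, [0.9, 0.1], UNIFORM_LOC)

print("H(O | I), no attention:     ", entropy(base_obj))
print("H(O | I), spatial attention:", entropy(spat_obj))
print("H(L | I), no attention:     ", entropy(base_loc))
print("H(L | I), feature attention:", entropy(feat_loc))
```

In this toy setting the entropy of the object posterior drops when the location prior is sharpened, and the entropy of the location posterior drops when the object prior is sharpened, which is the qualitative behavior the abstract attributes to spatial and feature-based attention respectively.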

[1]  Krista A. Ehinger,et al.  Modelling search for people in 900 scenes: A combined source model of eye guidance , 2009 .

[2]  D. Heeger,et al.  The Normalization Model of Attention , 2009, Neuron.

[3]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[4]  Shimon Ullman,et al.  Image interpretation by a single bottom-up top-down cycle , 2008, Proceedings of the National Academy of Sciences.

[5]  Tim K Marks,et al.  SUN: A Bayesian framework for saliency using natural statistics. , 2008, Journal of vision.

[6]  Nuno Vasconcelos,et al.  Bottom-up saliency is a discriminant process , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[7]  Tomaso Poggio,et al.  Trade-Off between Object Selectivity and Tolerance in Monkey Inferotemporal Cortex , 2007, The Journal of Neuroscience.

[8]  T. Poggio,et al.  A model of V4 shape selectivity and invariance. , 2007, Journal of neurophysiology.

[9]  Liqing Zhang,et al.  Saliency Detection: A Spectral Residual Approach , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Thomas Serre,et al.  A feedforward architecture accounts for rapid categorization , 2007, Proceedings of the National Academy of Sciences.

[11]  E. Miller,et al.  Top-Down Versus Bottom-Up Control of Attention in the Prefrontal and Posterior Parietal Cortices , 2007, Science.

[12]  Thomas Serre,et al.  Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Jeremy M. Wolfe,et al.  Guided Search 4.0: Current Progress With a Model of Visual Search , 2007, Integrated Models of Cognitive Systems.

[14]  John F. Kalaska,et al.  Computational neuroscience : theoretical insights into brain function , 2007 .

[15]  Christof Koch,et al.  Attention in hierarchical models of object recognition. , 2007, Progress in brain research.

[16]  Pietro Perona,et al.  Graph-Based Visual Saliency , 2006, NIPS.

[17]  L. Zhaoping,et al.  A theory of a saliency map in primary visual cortex (V1) tested by psychophysics of colour–orientation interference in texture segmentation , 2006 .

[18]  Laurent Itti,et al.  An Integrated Model of Top-Down and Bottom-Up Attention for Optimizing Detection Speed , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[19]  Thomas Serre,et al.  A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex , 2005 .

[20]  John K. Tsotsos,et al.  Saliency Based on Information Maximization , 2005, NIPS.

[21]  John K. Tsotsos,et al.  Neurobiology of Attention , 2005 .

[22]  Rajesh P. N. Rao,et al.  Bayesian Inference and Attentional Modulation in the Visual Cortex , 2005 .

[23]  Tomaso Poggio,et al.  Fast Readout of Object Identity from Macaque Inferior Temporal Cortex , 2005, Science.

[24]  Nuno Vasconcelos,et al.  Integrated learning of saliency, complex features, and object detectors from cluttered scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[25]  Robert Desimone,et al.  Parallel and Serial Neural Mechanisms for Visual Search in Macaque Area V4 , 2005, Science.

[26]  Yuanzhen Li,et al.  Feature congestion: a measure of display clutter , 2005, CHI.

[27]  Peter Dayan,et al.  Inference, Attention, and Decision in a Bayesian Neural Architecture , 2004, NIPS.

[28]  F. Fleuret Fast Binary Feature Selection with Conditional Mutual Information , 2004, J. Mach. Learn. Res.

[29]  Minami Ito,et al.  Representation of Angles Embedded within Contour Stimuli in Area V2 of Macaque Monkeys , 2004, The Journal of Neuroscience.

[30]  E. Rolls,et al.  A Neurodynamical cortical model of visual attention and invariant object recognition , 2004, Vision Research.

[31]  Kunihiko Fukushima,et al.  A neural network model for selective attention in visual pattern recognition , 1986, Biological Cybernetics.

[32]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[33]  Antonio Torralba,et al.  Top-down control of visual attention in object detection , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[34]  Jay Hegdé,et al.  How Selective Are V1 Cells for Pop-Out Stimuli? , 2003, The Journal of Neuroscience.

[35]  Y. Amit,et al.  An integrated network for invariant visual detection and recognition , 2003, Vision Research.

[36]  Antonio Torralba,et al.  Modeling global scene factors in attention. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[37]  Tai Sing Lee,et al.  Hierarchical Bayesian inference in the visual cortex. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[38]  Katherine M. Armstrong,et al.  Selective gating of visual signals by microstimulation of frontal cortex , 2003, Nature.

[39]  M. Goldberg,et al.  Neuronal Activity in the Lateral Intraparietal Area and Spatial Attention , 2003, Science.

[40]  S. Hochstein,et al.  View from the Top: Hierarchies and Reverse Hierarchies in the Visual System , 2002, Neuron.

[41]  Thomas Serre,et al.  On the Role of Object-Specific Features for Real World Object Recognition in Biological Vision , 2002, Biologically Motivated Computer Vision.

[42]  Simon J. Thorpe,et al.  Ultra-Rapid Scene Categorization with a Wave of Spikes , 2002, Biologically Motivated Computer Vision.

[43]  S. Treue,et al.  Attentional Modulation Strength in Cortical Area MT Depends on Stimulus Contrast , 2002, Neuron.

[44]  Michel Vidal-Naquet,et al.  Visual features of intermediate complexity and their use in classification , 2002, Nature Neuroscience.

[45]  Rajesh P. N. Rao,et al.  Probabilistic Models of the Brain: Perception and Neural Function , 2002 .

[46]  C. Connor,et al.  Shape representation in area V4: position-specific tuning for boundary conformation. , 2001, Journal of neurophysiology.

[47]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[48]  F. Velde,et al.  From Knowing What to Knowing Where: Modeling Object-Based Attention with Feedback Disinhibition of Activation , 2001 .

[49]  C. Koch,et al.  Computational modelling of visual attention , 2001, Nature Reviews Neuroscience.

[50]  S. Thorpe,et al.  A Limit to the Speed of Processing in Ultra-Rapid Visual Categorization of Novel Natural Scenes , 2001, Journal of Cognitive Neuroscience.

[51]  F. van der Velde,et al.  From Knowing What to Knowing Where: Modeling Object-Based Attention with Feedback Disinhibition of Activation , 2001, Journal of Cognitive Neuroscience.

[52]  Edmund T. Rolls,et al.  A Model of Invariant Object Recognition in the Visual System: Learning Rules, Activation Functions, Lateral Inhibition, and Information-Based Performance Measures , 2000, Neural Computation.

[53]  Takeo Kanade,et al.  A statistical method for 3D object detection applied to faces and cars , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[54]  R. Desimone,et al.  Attention Increases Sensitivity of V4 Neurons , 2000, Neuron.

[55]  J. Hegdé,et al.  Selectivity for Complex Shapes in Primate Visual Area V2 , 2000, The Journal of Neuroscience.

[56]  T. Poggio,et al.  Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.

[57]  R. Rosenholtz A simple saliency model predicts a number of motion popout phenomena , 1999, Vision Research.

[58]  Stefan Treue,et al.  Feature-based attention influences motion processing gain in macaque visual cortex , 1999, Nature.

[59]  R. Desimone,et al.  Competitive Mechanisms Subserve Attention in Macaque Areas V2 and V4 , 1999, The Journal of Neuroscience.

[60]  Carrie J. McAdams,et al.  Effects of Attention on Orientation-Tuning Functions of Single Neurons in Macaque Cortical Area V4 , 1999, The Journal of Neuroscience.

[61]  M. Goldberg,et al.  Space and attention in parietal cortex. , 1999, Annual review of neuroscience.

[62]  R. Zemel,et al.  Statistical models and sensory attention , 1999 .

[63]  R. Desimone Visual attention mediated by biased competition in extrastriate visual cortex. , 1998, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[64]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[65]  J. Movshon,et al.  Linearity and Normalization in Simple Cells of the Macaque Primary Visual Cortex , 1997, The Journal of Neuroscience.

[66]  John K. Tsotsos Limited Capacity of Any Realizable Perceptual System Is a Sufficient Reason for Attentive Behavior , 1997, Consciousness and Cognition.

[67]  Bartlett W. Mel SEEMORE: Combining Color, Shape, and Texture Histogramming in a Neurally Inspired Approach to Visual Object Recognition , 1997, Neural Computation.

[68]  D. C. Essen,et al.  Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. , 1996, Journal of neurophysiology.

[69]  Keiji Tanaka,et al.  Inferotemporal cortex and object vision. , 1996, Annual review of neuroscience.

[70]  J. Duncan Target and nontarget grouping in visual search , 1995, Perception & psychophysics.

[71]  Leslie G. Ungerleider,et al.  ‘What’ and ‘where’ in the human brain , 1994, Current Opinion in Neurobiology.

[72]  David I. Perrett,et al.  Neurophysiology of shape processing , 1993, Image Vis. Comput.

[73]  Keiji Tanaka,et al.  Coding visual images of objects in the inferotemporal cortex of the macaque monkey. , 1991, Journal of neurophysiology.

[74]  D. J. Felleman,et al.  Distributed hierarchical processing in the primate cerebral cortex. , 1991, Cerebral cortex.

[75]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[76]  I. Biederman Recognition-by-components: a theory of human image understanding. , 1987, Psychological review.

[77]  R. Desimone,et al.  Visual properties of neurons in area V4 of the macaque: sensitivity to stimulus form. , 1987, Journal of neurophysiology.

[78]  S Ullman,et al.  Shifts in selective visual attention: towards the underlying neural circuitry. , 1985, Human neurobiology.

[79]  M. Posner,et al.  Components of visual orienting , 1984 .

[80]  I. Biederman,et al.  Scene perception: Detecting and judging objects undergoing relational violations , 1982, Cognitive Psychology.

[81]  J. Daugman Two-dimensional spectral analysis of cortical receptive field profiles , 1980, Vision Research.

[82]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[83]  D. Hubel,et al.  Receptive fields and functional architecture of monkey striate cortex , 1968, The Journal of physiology.

[84]  A. L. I︠A︡rbus Eye Movements and Vision , 1967 .

[85]  A. L. Yarbus,et al.  Eye Movements and Vision , 1967, Springer US.

[86]  D. Hubel,et al.  Receptive fields of single neurones in the cat's striate cortex , 1959, The Journal of physiology.

[87]  J. Deutsch Perception and Communication , 1958, Nature.