The What-and-Where Filter: A Spatial Mapping Neural Network for Object Recognition and Image Understanding

The What-and-Where filter forms part of a neural network architecture for spatial mapping, object recognition, and image understanding. The Where filter responds to an image figure that has been separated from its background. It generates a spatial map whose cell activations simultaneously represent the position, orientation, and size of all the figures in a scene (where they are). This spatial map may be used to direct spatially localized attention to these image features. The Where filter is defined by a multiscale array of oriented detectors, followed by competitive and interpolative interactions between position, orientation, and size scales. This analysis discloses several issues that a spatial mapping system based upon oriented filters must address, such as the role of cliff filters with and without normalization, the double peak problem of maximum orientation across size scale, and the fact that self-similar interpolation properties differ between orientation and size scale. Several computationally efficient Where filters are proposed. The Where filter may be used for parallel transformation of multiple image figures into invariant representations that are insensitive to the figures' original position, orientation, and size. These invariant figural representations form part of a system devoted to attentive object learning and recognition (what it is). Unlike some alternative models in which serial search for a target occurs, a What-and-Where representation can be used to rapidly search in parallel for a desired target in a scene. Such a representation can also be used to learn multidimensional representations of objects and their spatial relationships for purposes of image understanding.
The What-and-Where filter is inspired by neurobiological data showing that a Where processing stream in the cerebral cortex is used for attentive spatial localization and orientation, whereas a What processing stream is used for attentive object learning and recognition.
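The idea of a multiscale array of oriented detectors followed by competition across position, orientation, and size can be illustrated with a small sketch. This is not the filter defined in the paper: the elongated Gaussian detectors, the aspect ratio, the L2 normalization, and the plain argmax standing in for the competitive interactions are all assumptions made for illustration, the interpolative stage is omitted, and the names `oriented_kernel`, `kernel_norm`, `where_map`, and `winner` are hypothetical.

```python
import math

def oriented_kernel(dx, dy, theta, size, aspect=4.0):
    """Elongated Gaussian detector (an assumed shape, not the paper's filter).

    theta is the long-axis orientation in radians, measured in image
    coordinates (x to the right, y downward); size is the long-axis sigma.
    """
    u = dx * math.cos(theta) + dy * math.sin(theta)    # along the long axis
    v = -dx * math.sin(theta) + dy * math.cos(theta)   # across the long axis
    s_long, s_short = size, size / aspect
    return math.exp(-(u * u / (2 * s_long ** 2) + v * v / (2 * s_short ** 2)))

def kernel_norm(theta, size):
    """L2 norm of the sampled kernel, so scales can be compared (assumption)."""
    r = int(3 * size) + 1
    total = sum(oriented_kernel(dx, dy, theta, size) ** 2
                for dx in range(-r, r + 1) for dy in range(-r, r + 1))
    return math.sqrt(total)

def where_map(image, thetas, sizes):
    """Normalized detector response at every (x, y, orientation index, size)."""
    h, w = len(image), len(image[0])
    # The figure is assumed to be already separated from its background.
    figure = [(x, y) for y in range(h) for x in range(w) if image[y][x]]
    responses = {}
    for ti, theta in enumerate(thetas):
        for size in sizes:
            norm = kernel_norm(theta, size)
            for y in range(h):
                for x in range(w):
                    total = sum(oriented_kernel(px - x, py - y, theta, size)
                                for px, py in figure)
                    responses[(x, y, ti, size)] = total / norm
    return responses

def winner(responses):
    """Winner-take-all stand-in: the single most active map cell."""
    return max(responses, key=responses.get)

# A horizontal bar, 13 pixels long, centered in a 21 x 21 binary image.
image = [[1 if (y == 10 and 4 <= x <= 16) else 0 for x in range(21)]
         for y in range(21)]
angles = (0, 45, 90, 135)
thetas = [math.radians(a) for a in angles]
x, y, ti, size = winner(where_map(image, thetas, sizes=(2.0, 4.0)))
print("position:", (x, y), "orientation (deg):", angles[ti], "size:", size)
```

In this sketch the winning cell's indices play the role of the map's position, orientation, and size code for a single figure. Which size wins depends strongly on the normalization chosen, which is one of the issues the abstract raises.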
