How Does the Brain Rapidly Learn and Reorganize View- and Positionally-Invariant Object Representations in Inferior Temporal Cortex?

All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model, called pARTSCAN, is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The pARTSCAN model proposes how the following additional processes in the What cortical processing stream also enable positionally-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object.
The swapping procedure is predicted to prevent the reset of spatial attention, which would otherwise keep the representations of multiple objects from being combined by learning. Li & DiCarlo (2008) have presented neurophysiological data from monkeys showing how unsupervised natural experience in a target swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates the swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data using neurophysiological methods in monkeys and spatial attention, episodic learning, and memory retrieval data …
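The role that the abstract assigns to attentional reset can be illustrated with a toy sketch. It is not the pARTSCAN equations: the function names, the view labels, the learning rate, and the saturating Hebbian-style update are all assumptions made for illustration. The only idea taken from the text is that an object category stays persistently active until spatial attention resets, so every view seen before a reset is linked to the same category.

```python
from collections import defaultdict


def learn_view_to_object_links(events, rate=0.5):
    """Toy view-to-object learning gated by attentional reset (illustrative only).

    events: list of (view_label, attention_reset) pairs in viewing order.
    Returns a dict mapping (view_label, category_index) to a link strength.
    """
    weights = defaultdict(float)
    active = None        # persistently active object category (hypothetical)
    n_categories = 0
    for view, reset in events:
        if reset or active is None:
            # After a reset, reuse the category this view already drives most
            # strongly, or recruit a new one if the view is unfamiliar.
            known = {c: w for (v, c), w in weights.items() if v == view}
            if known:
                active = max(known, key=known.get)
            else:
                active = n_categories
                n_categories += 1
        # Saturating update: the link grows toward 1 while the category is active.
        key = (view, active)
        weights[key] += rate * (1.0 - weights[key])
    return dict(weights)


def category_of(weights, view):
    """Return the category most strongly linked to a view, or None if unseen."""
    known = {c: w for (v, c), w in weights.items() if v == view}
    return max(known, key=known.get) if known else None


# Normal exposure: attention resets when gaze moves to a different object.
normal = learn_view_to_object_links([
    ("A_periphery", True), ("A_fovea", False),
    ("B_periphery", True), ("B_fovea", False),
])

# Swap exposure: object A is replaced by B during the saccade, with no reset.
swap = learn_view_to_object_links([
    ("A_periphery", True), ("B_fovea", False),
])

if __name__ == "__main__":
    print("normal:", normal)
    print("swap:  ", swap)
```

In the normal sequence the two views of A share one category and the views of B share another. In the swap sequence the foveal view of B is linked to the category that the peripheral view of A activated, which is the kind of erroneous merging the swapping procedure is predicted to produce.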

[1]  D. Whitteridge,et al.  The representation of the visual field on the cerebral cortex in monkeys , 1961, The Journal of physiology.

[2]  A. L. Yarbus,et al.  Eye Movements and Vision , 1967, Springer US.

[3]  S Grossberg,et al.  Some nonlinear networks capable of learning a spatial pattern of arbitrary complexity. , 1968, Proceedings of the National Academy of Sciences of the United States of America.

[4]  F. Werblin Adaptation in a vertebrate retina: intracellular recording in Necturus. , 1971, Journal of neurophysiology.

[5]  D. B. Bender,et al.  Visual properties of neurons in inferotemporal cortex of the Macaque. , 1972, Journal of neurophysiology.

[6]  B. Fischer Overlap of receptive field centers and representation of the visual field in the cat's optic tract. , 1973, Vision research.

[7]  S. Grossberg Contour Enhancement, Short Term Memory, and Constancies in Reverberating Neural Networks , 1973 .

[8]  O. Vinogradova Functional Organization of the Limbic System in the Process of Registration of Information: Facts and Hypotheses , 1975 .

[9]  R. Desimone,et al.  Visual areas in the temporal cortex of the macaque , 1979, Brain Research.

[10]  Eric L. Schwartz,et al.  Computational anatomy and functional architecture of striate cortex: A spatial mapping approach to perceptual coding , 1980, Vision Research.

[11]  S. Grossberg,et al.  How does a brain build a cognitive code? , 1980, Psychological review.

[12]  J. Fuster,et al.  Inferotemporal neurons distinguish and retain behaviorally relevant features of visual stimuli. , 1981, Science.

[13]  E. Switkes,et al.  Deoxyglucose analysis of retinotopic organization in primate striate cortex. , 1982, Science.

[14]  R. Desimone,et al.  Shape recognition and inferior temporal neurons. , 1983, Proceedings of the National Academy of Sciences of the United States of America.

[15]  John H. R. Maunsell,et al.  The visual field representation in striate cortex of the macaque monkey: Asymmetries, anisotropies, and individual variability , 1984, Vision Research.

[16]  R. von der Heydt,et al.  Illusory contours and cortical neuron responses. , 1984, Science.

[17]  S Grossberg,et al.  Neural dynamics of perceptual grouping: Textures, boundaries, and emergent segmentations , 1985, Perception & psychophysics.

[18]  S Grossberg,et al.  Cortical dynamics of three-dimensional form, color, and brightness perception: II. Binocular theory , 1988, Perception & psychophysics.

[19]  Stephen Grossberg,et al.  A massively parallel architecture for a self-organizing neural pattern recognition machine , 1988, Comput. Vis. Graph. Image Process..

[20]  Teuvo Kohonen,et al.  Self-Organization and Associative Memory , 1988 .

[21]  Y. Miyashita,et al.  Neuronal correlate of pictorial short-term memory in the primate temporal cortex , 1988, Nature.

[22]  S. Grossberg,et al.  Neural dynamics of 1-D and 2-D brightness perception: A unified model of classical and recent phenomena , 1988, Perception & psychophysics.

[23]  H. Spitzer,et al.  Increased attention enhances both behavioral and neuronal performance. , 1988, Science.

[24]  R. von der Heydt,et al.  Mechanisms of contour perception in monkey visual cortex. I. Lines of pattern discontinuity , 1989, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[25]  R. von der Heydt,et al.  Mechanisms of contour perception in monkey visual cortex. II. Contours bridging gaps , 1989, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[26]  Stephen Grossberg,et al.  Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system , 1991, Neural Networks.

[27]  W. M. Hart Cognition through Color , 1991, Neurology.

[28]  J. Horton,et al.  The representation of the visual field in human striate cortex. A revision of the classic Holmes map. , 1991, Archives of ophthalmology.

[29]  Allen M. Waxman,et al.  Adaptive 3-D Object Recognition from Multiple Views , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[30]  H. Eichenbaum,et al.  Neuronal activity in the hippocampus during delayed non‐match to sample performance in rats: Evidence for hippocampal processing in recognition memory , 1992, Hippocampus.

[31]  Stephen Grossberg,et al.  Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps , 1992, IEEE Trans. Neural Networks.

[32]  H H Bülthoff,et al.  Psychophysical support for a two-dimensional view interpolation theory of object recognition. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[33]  R. Desimone,et al.  Activity of neurons in anterior inferior temporal cortex during a short- term memory task , 1993, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[34]  Gail A. Carpenter,et al.  ART-EMAP: A Neural Network Architecture for Learning and Prediction by Evidence Accumulation , 1993 .

[35]  S. Grossberg,et al.  Normal and amnesic learning, recognition and memory by a neural model of cortico-hippocampal interactions , 1993, Trends in Neurosciences.

[36]  S Grossberg,et al.  3-D vision and figure-ground separation by visual cortex , 2010, Perception & psychophysics.

[37]  N. Logothetis,et al.  View-dependent object recognition by monkeys , 1994, Current Biology.

[38]  Gerald Sommer,et al.  Pattern Recognition by Self-Organizing Neural Networks , 1994 .

[39]  H H Bülthoff,et al.  How are three-dimensional objects represented in the brain? , 1994, Cerebral cortex.

[40]  Minami Ito,et al.  Size and position invariance of neuronal responses in monkey inferotemporal cortex. , 1995, Journal of neurophysiology.

[41]  Stephen Grossberg,et al.  Fast-learning VIEWNET architectures for recognizing three-dimensional objects from multiple two-dimensional views , 1995, Neural Networks.

[42]  C W Tyler,et al.  Mechanisms of Stereoscopic Processing: Stereoattention and Surface Perception in Depth Reconstruction , 1995, Perception.

[43]  E. Rolls,et al.  INVARIANT FACE AND OBJECT RECOGNITION IN THE VISUAL SYSTEM , 1997, Progress in Neurobiology.

[44]  S. Zucker,et al.  Evidence for boundary-specific grouping , 1998, Vision Research.

[45]  Diane C. Rogers-Ramachandran,et al.  Psychophysical evidence for boundary and surface systems in human vision , 1998, Vision Research.

[46]  R. Desimone Visual attention mediated by biased competition in extrastriate visual cortex. , 1998, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[47]  E. Rolls,et al.  View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. , 1998, Cerebral cortex.

[48]  T. Poggio,et al.  Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.

[49]  Y. Miyashita,et al.  Top-down signal from prefrontal cortex in executive control of memory retrieval , 1999, Nature.

[50]  Victor A. F. Lamme,et al.  Separate processing dynamics for texture elements, boundaries and surfaces in primary visual cortex of the macaque monkey. , 1999, Cerebral cortex.

[51]  D A Pollen,et al.  On the neural correlates of visual perception. , 1999, Cerebral cortex.

[52]  M. Corbetta,et al.  Erratum to “Translocation machinery for synthesis of integral membrane and secretory proteins in dendritic spines” , 2000, Nature Neuroscience.

[53]  Tomaso Poggio,et al.  Models of object recognition , 2000, Nature Neuroscience.

[54]  T. Sejnowski,et al.  Neurocomputational models of working memory , 2000, Nature Neuroscience.

[55]  Xiao-Jing Wang Synaptic reverberation underlying mnemonic persistent activity , 2001, Trends in Neurosciences.

[56]  W. Singer,et al.  Dynamic predictions: Oscillations and synchrony in top–down processing , 2001, Nature Reviews Neuroscience.

[57]  David A. McCormick,et al.  Brain calculus: neural integration and persistent activity , 2001, Nature Neuroscience.

[58]  T. Poggio,et al.  Neural mechanisms of object recognition , 2002, Current Opinion in Neurobiology.

[59]  S. Yantis,et al.  Transient neural activity in human parietal cortex during spatial attention shifts , 2002, Nature Neuroscience.

[60]  S. Grossberg How does the cerebral cortex work? Development, learning, attention, and 3-D vision by laminar circuits of visual cortex. , 2003, Behavioral and cognitive neuroscience reviews.

[61]  D. Amit,et al.  Retrospective and prospective persistent activity induced by Hebbian learning in a recurrent cortical network , 2003, The European journal of neuroscience.

[62]  S. Grossberg,et al.  Towards a theory of the laminar architecture of cerebral cortex: computational clues from the visual system. , 2003, Cerebral cortex.

[63]  Nicolas Brunel,et al.  Dynamics and plasticity of stimulus-selective persistent activity in cortical network models. , 2003, Cerebral cortex.

[64]  S. Grossberg,et al.  Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors , 1976, Biological Cybernetics.

[65]  S. Grossberg,et al.  Laminar cortical dynamics of 3D surface perception: Stratification, transparency, and neon color spreading , 2005, Vision Research.

[66]  Stephen Grossberg,et al.  A laminar cortical model of stereopsis and 3D surface perception: closure and da Vinci stereopsis. , 2004, Spatial vision.

[67]  S. Grossberg,et al.  Neural dynamics of autistic behaviors: cognitive, emotional, and timing substrates. , 2006, Psychological review.

[68]  Stephen Grossberg,et al.  Consciousness CLEARS the mind , 2007, Neural Networks.

[69]  Thomas Serre,et al.  A feedforward architecture accounts for rapid categorization , 2007, Proceedings of the National Academy of Sciences.

[70]  S. Grossberg,et al.  Texture segregation by visual cortex: Perceptual grouping, attention, and learning , 2007, Vision Research.

[71]  Thomas Serre,et al.  A quantitative theory of immediate visual recognition. , 2007, Progress in brain research.

[72]  F. Windels,et al.  Neuronal activity , 2006, Molecular Neurobiology.

[73]  S. Grossberg,et al.  Spikes, synchrony, and attentive learning by laminar thalamocortical circuits , 2006, Brain Research.

[74]  P. Cavanagh,et al.  Visual short-term memory operates more efficiently on boundary features than on surface features , 2008, Perception & psychophysics.

[75]  M. Moscovitch,et al.  Top-down and bottom-up attention to memory: A hypothesis (AtoM) on the role of the posterior parietal cortex in memory retrieval , 2008, Neuropsychologia.

[76]  N. Li,et al.  Unsupervised Natural Experience Rapidly Alters Invariant Object Representation in Visual Cortex , 2008, Science.

[77]  M. Moscovitch,et al.  The parietal cortex and episodic memory: an attentional account , 2008, Nature Reviews Neuroscience.

[78]  M. Tsodyks,et al.  Synaptic Theory of Working Memory , 2008, Science.

[79]  S. Grossberg,et al.  View-invariant object category learning, recognition, and search: How spatial and object attention are coordinated using surface-based attentional shrouds , 2009, Cognitive Psychology.

[80]  D. Heeger,et al.  The Normalization Model of Attention , 2009, Neuron.

[81]  Stephen Grossberg,et al.  ARTSCENE: A neural system for natural scene classification. , 2009, Journal of vision.

[82]  S. Grossberg Cortical and subcortical predictive dynamics and learning during perception, cognition, emotion and action , 2009, Philosophical Transactions of the Royal Society B: Biological Sciences.

[83]  S. Yantis,et al.  A Domain-Independent Source of Cognitive Control for Task Sets: Shifting Spatial Attention and Switching Categorization Rules , 2009, The Journal of Neuroscience.

[84]  S. Grossberg,et al.  From stereogram to surface: how the brain sees the world in depth. , 2009, Spatial vision.

[85]  Stephen Grossberg,et al.  Where's Waldo? How the brain learns to categorize and discover desired objects in a cluttered scene , 2010 .

[86]  J. DiCarlo,et al.  Unsupervised Natural Visual Experience Rapidly Reshapes Size-Invariant Object Representation in Inferior Temporal Cortex , 2010, Neuron.

[87]  Nicholas C. Foley,et al.  Neural Dynamics of Object-based Multifocal Visual Spatial Attention and Priming: Object Cueing, Useful-field-of-view, and Crowding Cognitive Psychology , 2012 .

[88]  R. K. Simpson Nature Neuroscience , 2022 .