Hard-wired feed-forward visual mechanisms of the brain compensate for affine variations in object recognition

Humans recognize objects effortlessly and accurately. However, it is unknown how the visual system copes with variations in objects' appearance and in environmental conditions. Previous studies have suggested that affine variations such as size and position are compensated for in the feed-forward sweep of visual information processing, whereas feedback signals are needed for precise recognition under non-affine variations such as pose and lighting. Yet no empirical data exist to support this suggestion. We systematically investigated the impact of these affine and non-affine variations on the categorization performance of the feed-forward mechanisms of the human brain. To that end, we designed a backward-masking behavioral categorization paradigm as well as a passive-viewing EEG recording experiment. Across a set of varying stimuli, we found that the feed-forward visual pathways contributed more to compensating for variations in size and position than for variations in lighting and pose. This was reflected in both the amplitude and the latency of the category separability indices obtained from the EEG signals. Using a feed-forward computational model of the ventral visual stream, we also confirmed a more dominant role for the feed-forward visual mechanisms of the brain in compensating for affine variations. Taken together, our experimental results support the theory that non-affine variations such as pose and lighting may require top-down feedback from higher areas, such as the inferior temporal cortex (IT) and the prefrontal cortex (PFC), for precise object recognition.
