Representations of regular and irregular shapes by deep Convolutional Neural Networks, monkey inferotemporal neurons and human judgments

Recent studies suggest that deep Convolutional Neural Network (CNN) models show higher representational similarity than any other existing object recognition model to macaque inferior temporal (IT) cortical responses, human ventral stream fMRI activations and human object recognition. These studies employed natural images of objects. A long research tradition has employed abstract shapes to probe the selectivity of IT neurons. If CNN models provide a realistic model of IT responses, then they should capture the IT selectivity for such shapes. Here, we compare the activations of CNN units in response to a stimulus set of 2D regular and irregular shapes with the response selectivity of macaque IT neurons and with human similarity judgments. The shape set consisted of regular shapes that differed in nonaccidental properties, and irregular, asymmetrical shapes with curved or straight boundaries. We found that deep CNNs (AlexNet, VGG-16 and VGG-19) that were trained to classify natural images show response modulations to these shapes that were similar to those of IT neurons. Untrained CNNs with the same architecture as the trained CNNs, but with random weights, showed poorer similarity than CNNs trained in classification. The difference between the trained and untrained CNNs emerged at the deep convolutional layers, where the similarity between the shape-related response modulations of IT neurons and the trained CNNs was high. Unlike IT neurons, human similarity judgments of the same shapes correlated best with the last layers of the trained CNNs. In particular, these deepest layers showed an enhanced sensitivity for straight versus curved irregular shapes, similar to that shown in human shape judgments.
In conclusion, the representations of abstract shape similarity are highly comparable between macaque IT neurons and deep convolutional layers of CNNs that were trained to classify natural images, while human shape similarity judgments correlate better with the deepest layers.
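
The comparison described above, between the responses of one system (for example, a CNN layer) and another (for example, a population of IT neurons) to the same shapes, can be illustrated with a minimal representational similarity sketch. The abstract does not specify the distance measure or the correlation statistic, so the Euclidean distance, the Spearman-style rank correlation, the function names and the random data below are all assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np


def pairwise_dissimilarity(responses):
    """Return the upper-triangle Euclidean distances between stimuli.

    responses: array of shape (n_stimuli, n_units), one row per shape.
    Euclidean distance is an assumed choice, not taken from the paper.
    """
    responses = np.asarray(responses, dtype=float)
    diffs = responses[:, None, :] - responses[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(responses.shape[0], k=1)
    return dist[upper]


def rank(values):
    """Rank values from 0 to n-1 (simple version; ties are not averaged)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def representational_similarity(responses_a, responses_b):
    """Rank-correlate the dissimilarity structure of two response sets."""
    d_a = pairwise_dissimilarity(responses_a)
    d_b = pairwise_dissimilarity(responses_b)
    return float(np.corrcoef(rank(d_a), rank(d_b))[0, 1])


# Hypothetical data: 8 shapes, 20 "neurons" and 50 "CNN units".
rng = np.random.default_rng(0)
neural = rng.normal(size=(8, 20))
layer = rng.normal(size=(8, 50))

print(representational_similarity(neural, neural))  # identical geometry
print(representational_similarity(neural, layer))   # unrelated random data
```

Repeating the call for each layer of a trained and an untrained network would give the kind of layer-by-layer similarity profile the abstract refers to, under the assumptions stated above.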
