Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images

Numerous factors have been reported to underlie the representation of complex images in high-level human visual cortex, including categories (e.g., faces, objects, scenes), animacy, and real-world size, but the extent to which this organization reflects behavioral judgments of real-world stimuli is unclear. Here, we compared representations derived from explicit behavioral similarity judgments and ultra-high field (7T) fMRI of human visual cortex for multiple exemplars of a diverse set of naturalistic images from 48 object and scene categories. While there was a significant correlation between similarity judgments and fMRI responses, there were striking differences between the two representational spaces. Behavioral judgments primarily revealed a coarse division between man-made (including humans) and natural (including animals) images, with clear groupings of conceptually related categories (e.g., transportation, animals), while these conceptual groupings were largely absent in the fMRI representations. Instead, fMRI responses primarily seemed to reflect a separation of both human and non-human faces/bodies from all other categories. Further, comparison of the behavioral and fMRI representational spaces with those derived from the layers of a deep neural network (DNN) showed a strong correspondence with behavior in the topmost layer and with fMRI in the mid-level layers. These results suggest a complex relationship between localized responses in high-level visual cortex and behavioral similarity judgments: each domain reflects different properties of the images, and responses in high-level visual cortex may correspond to intermediate stages of processing between basic visual features and the conceptual categories that dominate the behavioral response.
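The comparison of representational spaces described above can be illustrated with a minimal sketch of representational similarity analysis. The abstract does not specify the distance measure or the statistic used, so the choices here (correlation distance for the dissimilarity matrices, Spearman rank correlation between their upper triangles) are assumptions, as are all function names and the synthetic data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    return pearson(ranks(x), ranks(y))


def rdm_upper(patterns):
    """Upper triangle of a dissimilarity matrix, using 1 - Pearson r
    between the response patterns of each pair of categories (assumed metric)."""
    n = len(patterns)
    return [1 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def compare_spaces(patterns_a, patterns_b):
    """Rank-correlate the dissimilarity structure of two representational spaces."""
    return spearman(rdm_upper(patterns_a), rdm_upper(patterns_b))


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins: 6 categories x 20 features (e.g. voxels or DNN units).
    space_a = [[random.gauss(0, 1) for _ in range(20)] for _ in range(6)]
    # A noisy copy plays the role of a second measurement of the same structure.
    space_b = [[v + random.gauss(0, 0.5) for v in row] for row in space_a]
    print("pairs compared:", len(rdm_upper(space_a)))
    print("Spearman rho between spaces:", compare_spaces(space_a, space_b))
```

With 48 categories each dissimilarity matrix would have 48 × 47 / 2 = 1128 unique pairs. Behavioral dissimilarities collected directly as pairwise values could be passed to `spearman` in place of one `rdm_upper` result.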
