Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream

Converging evidence suggests that the primate ventral visual pathway encodes increasingly complex stimulus features in downstream areas. We quantitatively show that there indeed exists an explicit gradient for feature complexity in the ventral pathway of the human brain. This was achieved by mapping thousands of stimulus features of increasing complexity across the cortical sheet using a deep neural network. Our approach also revealed a fine-grained functional specialization of downstream areas of the ventral stream. Furthermore, it allowed decoding of representations from human brain activity at an unsurpassed degree of accuracy, confirming the quality of the developed approach. Stimulus features that successfully explained neural responses indicate that population receptive fields were explicitly tuned for object categorization. This provides strong support for the hypothesis that object categorization is a guiding principle in the functional organization of the primate ventral stream.
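
As a rough illustration of what mapping features of increasing complexity across the cortical sheet can look like in practice, the sketch below fits one linear encoding model per network layer and assigns each voxel to the layer whose features best predict its held-out responses. This is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not specify the regression method, so ridge regression, the Pearson-correlation score, the helper names (`ridge_fit`, `columnwise_corr`, `assign_layers`), and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / np.where(denom == 0, 1.0, denom)

def assign_layers(layer_features, responses, n_train, alpha=1.0):
    """Return, per voxel, the index of the layer with the best held-out prediction.

    layer_features: list of (n_stimuli, n_features) arrays, one per layer
    responses:      (n_stimuli, n_voxels) array of voxel responses
    """
    scores = []
    for X in layer_features:
        # Fit on the first n_train stimuli, evaluate on the remaining ones.
        W = ridge_fit(X[:n_train], responses[:n_train], alpha)
        pred = X[n_train:] @ W
        scores.append(columnwise_corr(pred, responses[n_train:]))
    scores = np.stack(scores)  # shape: (n_layers, n_voxels)
    return scores.argmax(axis=0), scores

# Synthetic demo (assumption: random features stand in for DNN layer
# activations, and each simulated voxel is driven by exactly one layer).
rng = np.random.default_rng(0)
n_stimuli, n_train, n_feat = 300, 200, 10
layer_features = [rng.standard_normal((n_stimuli, n_feat)) for _ in range(3)]
true_layer = np.array([0, 0, 1, 1, 2, 2])
responses = np.column_stack([
    layer_features[l] @ rng.standard_normal(n_feat)
    + 0.1 * rng.standard_normal(n_stimuli)
    for l in true_layer
])

best_layer, scores = assign_layers(layer_features, responses, n_train)
print(best_layer)
```

With real data, plotting `best_layer` on the cortical surface would show whether the preferred layer tends to increase from upstream to downstream areas, which is the kind of gradient the abstract describes.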
