Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence

The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space, from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together, our results provide an algorithmically informed view of the spatio-temporal dynamics of visual object recognition in the human visual brain.
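The abstract does not spell out how brain and DNN representations were compared. One common way to relate representations that live in different feature spaces is representational similarity analysis, and the sketch below illustrates that idea on simulated data only. All names (`rdm`, `rdm_similarity`, `layer_activations`, and so on) are hypothetical, the data are random numbers, and nothing here should be read as the authors' actual pipeline.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix.

    patterns: array of shape (n_conditions, n_features).
    Returns 1 - Pearson correlation between every pair of condition patterns.
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Return the entries above the diagonal as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def spearman(a, b):
    """Spearman correlation via rank transform followed by Pearson.

    Ties are not averaged here, which is acceptable for continuous toy data.
    """
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]


def rdm_similarity(patterns_a, patterns_b):
    """Compare two representations through their dissimilarity structure."""
    return spearman(upper_triangle(rdm(patterns_a)),
                    upper_triangle(rdm(patterns_b)))


# Simulated data (assumption): 20 images, 50 features per representation.
rng = np.random.default_rng(0)
n_images, n_features = 20, 50
layer_activations = rng.normal(size=(n_images, n_features))

# A "brain" signal carrying the same geometry in a rotated feature space,
# plus a little measurement noise.
rotation, _ = np.linalg.qr(rng.normal(size=(n_features, n_features)))
brain_related = (layer_activations @ rotation
                 + 0.1 * rng.normal(size=(n_images, n_features)))

# A "brain" signal with no relation to the layer at all.
brain_unrelated = rng.normal(size=(n_images, n_features))

sim_related = rdm_similarity(layer_activations, brain_related)
sim_unrelated = rdm_similarity(layer_activations, brain_unrelated)
print(f"related: {sim_related:.2f}, unrelated: {sim_unrelated:.2f}")
```

In a real analysis, one such comparison would be run per DNN layer against each MEG time point or fMRI region, so that a hierarchical correspondence would show up as deeper layers matching later time points and regions further along the visual streams.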
