Convolutional Neural Networks as a Model of the Visual System: Past, Present, and Future

Convolutional neural networks (CNNs) were inspired by early findings in the study of biological vision. They have since become successful tools in computer vision and state-of-the-art models of both neural activity and behavior on visual tasks. This review highlights what, in the context of CNNs, it means to be a good model in computational neuroscience and the various ways models can provide insight. Specifically, it covers the origins of CNNs and the methods by which we validate them as models of biological vision. It then elaborates on what we can learn about biological vision by understanding and experimenting on CNNs, and discusses emerging opportunities for the use of CNNs in vision research beyond basic object recognition.
