Convolutional neural networks for vision neuroscience: significance, developments, and outstanding issues

Convolutional Neural Networks (CNNs) are a class of machine learning models predominantly used in computer vision tasks that can achieve human-like performance through learning from experience. Their striking similarities to the structural and functional principles of the primate visual system allow for comparisons between these artificial networks and their biological counterparts, enabling exploration of how visual functions and neural representations may emerge in the real brain from a limited set of computational principles. After considering the basic features of CNNs, we discuss the opportunities and challenges of endorsing CNNs as in silico models of the primate visual system. Specifically, we highlight several emerging notions about the anatomical and physiological properties of the visual system that still need to be systematically integrated into current CNN models. These tenets include the implementation of parallel processing pathways from the early stages of retinal input and the reconsideration of several assumptions concerning the serial progression of information flow. We suggest design choices and architectural constraints that could facilitate a closer alignment with biology and provide causal evidence of the predictive link between the artificial and biological visual systems. Adopting this principled perspective could potentially lead to new research questions and applications of CNNs beyond modeling object recognition.
