Deep convolutional models improve predictions of macaque V1 responses to natural images

Despite great efforts over several decades, our best models of primary visual cortex (V1) still predict spiking activity quite poorly when probed with natural stimuli, highlighting our limited understanding of the nonlinear computations in V1. Recently, two approaches based on deep learning have emerged for modeling these nonlinear computations: transfer learning from artificial neural networks trained on object recognition, and data-driven convolutional neural network models trained end-to-end on large populations of neurons. Here, we tested the ability of both approaches to predict spiking activity in response to natural images in V1 of awake monkeys. We found that the transfer learning approach performed about as well as the data-driven approach, and that both outperformed classical linear-nonlinear and wavelet-based feature representations that build on existing theories of V1. Notably, transfer learning using a pre-trained feature space required substantially less experimental time to achieve the same performance. In conclusion, multi-layer convolutional neural networks (CNNs) set a new state of the art for predicting neural responses to natural images in primate V1, and deep features learned for object recognition are better explanations for V1 computation than all previous filter bank theories. This finding strengthens the case for V1 models that are multiple nonlinearities away from the image domain, and it supports the idea of explaining early visual cortex in terms of high-level functional goals.
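
To make the transfer learning idea concrete, the following is a minimal sketch of a readout fitted on top of a fixed feature space, assuming a Poisson likelihood with an exponential output nonlinearity. It is not the authors' implementation: the features here are random synthetic stand-ins for the activations of a pre-trained network, the spike counts are simulated, and the names `fit_poisson_readout` and `predict_rate` are hypothetical.

```python
import numpy as np


def fit_poisson_readout(features, spikes, lr=0.1, steps=500, l2=1e-3):
    """Fit rate = exp(features @ w + b) by gradient descent on the
    mean Poisson negative log-likelihood with an L2 penalty on w."""
    n, d = features.shape
    w = np.zeros(d)
    b = np.log(spikes.mean() + 1e-8)  # start at the mean firing rate
    for _ in range(steps):
        rate = np.exp(features @ w + b)
        err = rate - spikes  # gradient of the Poisson NLL w.r.t. the log-rate
        w -= lr * (features.T @ err / n + l2 * w)
        b -= lr * err.mean()
    return w, b


def predict_rate(features, w, b):
    """Predicted firing rate for each stimulus."""
    return np.exp(features @ w + b)


# Synthetic stand-in data (assumption): in a real experiment the rows of X
# would be fixed features extracted from a pre-trained CNN for each image.
rng = np.random.default_rng(0)
n_train, n_test, n_features = 2000, 500, 10
X = rng.normal(size=(n_train + n_test, n_features))
w_true = rng.normal(scale=0.2, size=n_features)
rate_true = np.exp(X @ w_true - 0.5)
y = rng.poisson(rate_true)  # simulated spike counts for one neuron

w, b = fit_poisson_readout(X[:n_train], y[:n_train])
pred = predict_rate(X[n_train:], w, b)

# Correlation between predicted and ground-truth rates on held-out stimuli
corr = np.corrcoef(pred, rate_true[n_train:])[0, 1]
print(f"held-out correlation: {corr:.3f}")
```

Because only the readout weights are fitted while the feature space stays fixed, far fewer parameters must be estimated from neural data, which is one plausible reading of why a pre-trained feature space needs less experimental time.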
