Using goal-driven deep learning models to understand sensory cortex

Fueled by innovation in the computer vision and artificial intelligence communities, recent developments in computational neuroscience have used goal-driven hierarchical convolutional neural networks (HCNNs) to make strides in modeling neural single-unit and population responses in higher visual cortical areas. In this Perspective, we review the recent progress in a broader modeling context and describe some of the key technical innovations that have supported it. We then outline how the goal-driven HCNN approach can be used to delve even more deeply into understanding the development and organization of sensory cortical processing.
