Deep Learning Models of the Retinal Response to Natural Scenes

A central challenge in sensory neuroscience is to understand the neural computations and circuit mechanisms that underlie the encoding of ethologically relevant, natural stimuli. In multilayered neural circuits, nonlinear processes such as synaptic transmission and spiking dynamics present a significant obstacle to the creation of accurate computational models of responses to natural stimuli. Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes nearly to within the variability of a cell's response, and are markedly more accurate than linear-nonlinear (LN) models and generalized linear models (GLMs). Moreover, we find two additional surprising properties of CNNs: they are less susceptible to overfitting than their LN counterparts when trained on small amounts of data, and they generalize better when tested on stimuli drawn from a different distribution (e.g., between natural scenes and white noise). An examination of the learned CNNs reveals several properties. First, a richer set of feature maps is necessary for predicting the responses to natural scenes compared to white noise. Second, temporally precise responses to slowly varying inputs originate from feedforward inhibition, similar to known retinal mechanisms. Third, the injection of latent noise sources in intermediate layers enables our model to capture the sub-Poisson spiking variability observed in retinal ganglion cells. Fourth, augmenting our CNNs with recurrent lateral connections enables them to capture contrast adaptation as an emergent property of accurately describing retinal responses to natural scenes. These methods can be readily generalized to other sensory modalities and stimulus ensembles. Overall, this work demonstrates that CNNs not only accurately capture sensory circuit responses to natural scenes, but can also yield information about the circuit's internal structure and function.
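
For readers unfamiliar with the LN baseline that the abstract compares against, the following is a minimal sketch of the general idea: a stimulus is projected onto a linear filter, and the result is passed through a pointwise nonlinearity to give a firing rate. The filter values, bias, softplus nonlinearity, spike counts, and Poisson loss below are all assumptions chosen for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the predicted rate positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def ln_rate(stimulus, weights, bias):
    # LN model: linear projection of the stimulus, then a pointwise nonlinearity.
    drive = sum(w * s for w, s in zip(weights, stimulus)) + bias
    return softplus(drive)

def poisson_nll(rates, counts, eps=1e-9):
    # Mean Poisson negative log-likelihood, omitting the log(count!) term,
    # which does not depend on the model parameters.
    total = sum(r - c * math.log(r + eps) for r, c in zip(rates, counts))
    return total / len(rates)

# Hypothetical flattened spatiotemporal filter and a few toy stimulus frames.
weights = [0.5, -0.25, 0.1]
bias = 0.2
stimuli = [[1.0, 0.0, -1.0], [0.0, 2.0, 0.5], [-1.0, -1.0, 1.0]]
counts = [1, 0, 0]  # made-up spike counts, for illustration only

rates = [ln_rate(s, weights, bias) for s in stimuli]
loss = poisson_nll(rates, counts)
print(rates, loss)
```

A CNN of the kind described in the abstract would replace the single linear stage with several stacked convolutional layers and nonlinearities, while a loss of this form could still be used to compare predicted rates with recorded spike counts.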
