A three-layer model of natural image statistics

An important property of visual systems is that they are simultaneously selective to specific patterns in the sensory input and invariant to possible variations of those patterns. Selectivity and invariance (tolerance) are opposing requirements. It has been suggested that they could be reconciled by iterating a sequence of elementary selectivity and tolerance computations. It is, however, unknown what should be selected or tolerated at each level of the hierarchy. We approach this issue by learning the computations from natural images. We propose and estimate a probabilistic model of natural images that consists of three processing layers. Two natural image data sets are considered: image patches, and complete visual scenes downsampled to the size of small patches. For both data sets, we find that the first two layers perform simple and complex cell-like computations. In the third layer, we mainly find selectivity to longer contours; for patch data, we further find some selectivity to texture, while for the downsampled complete scenes, some selectivity to curvature is observed.
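The alternation of selectivity and tolerance steps described above can be illustrated with a toy sketch. It is not the model estimated in the paper: the weights below are set by hand rather than learned from natural images, the input is a one-dimensional sinusoid rather than an image patch, and the pooling rule (square root of summed squares over a quadrature pair, in the style of classical energy models) is an assumption, since the abstract does not give the functional form. All names (`linear`, `energy_pool`, `three_layer`, `grating`, `W1`, `GROUPS`, `W3`) are hypothetical.

```python
import math

N = 16  # number of samples in the toy 1-D "patch" (assumed size)

def linear(weights, x):
    """Selectivity step: each row of `weights` is a template matched against x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def energy_pool(responses, groups):
    """Tolerance step: pool squared responses within each group of units."""
    return [math.sqrt(sum(responses[i] ** 2 for i in g)) for g in groups]

def three_layer(x, w1, groups, w3):
    s1 = linear(w1, x)            # layer 1: simple cell-like linear features
    c2 = energy_pool(s1, groups)  # layer 2: complex cell-like pooled features
    s3 = linear(w3, c2)           # layer 3: selective to combinations of pooled features
    return s1, c2, s3

def wave(fn, k):
    """Sample cos or sin with k cycles over the N-sample patch."""
    return [fn(2.0 * math.pi * k * j / N) for j in range(N)]

def grating(phase, k=2):
    """Toy input: a sinusoid with k cycles and the given phase offset."""
    return [math.cos(2.0 * math.pi * k * j / N + phase) for j in range(N)]

# Hand-set layer-1 templates: one quadrature (cos, sin) pair per frequency.
W1 = [wave(math.cos, 1), wave(math.sin, 1),
      wave(math.cos, 2), wave(math.sin, 2)]
GROUPS = [(0, 1), (2, 3)]  # layer 2 pools each quadrature pair
W3 = [[1.0, -1.0]]         # layer 3 contrasts the two pooled frequency channels

if __name__ == "__main__":
    for phase in (0.0, 1.0):
        s1, c2, s3 = three_layer(grating(phase), W1, GROUPS, W3)
        print("phase", phase)
        print("  layer 1:", [round(v, 3) for v in s1])
        print("  layer 2:", [round(v, 3) for v in c2])
        print("  layer 3:", [round(v, 3) for v in s3])
```

In this sketch the layer-1 responses change with the phase of the input, while the pooled layer-2 responses (and therefore layer 3) stay the same up to floating-point error, which is the sense in which the second step is tolerant and the first is selective.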
