Development of localized oriented receptive fields by learning a translation-invariant code for natural images.

Neurons in the mammalian primary visual cortex are known to possess spatially localized, oriented receptive fields. It has previously been suggested that these distinctive properties may reflect an efficient image encoding strategy based either on maximizing the sparseness of the distribution of output neuronal activities or, alternatively, on extracting the independent components of natural image ensembles. Here, we show that a strategy for transformation-invariant coding of images, based on a first-order Taylor series expansion of an image, also leads to localized, oriented receptive fields being learned from natural image inputs. These receptive fields approximate localized first-order differential operators at various orientations. They allow a pair of cooperating neural networks, one estimating object identity ('what') and the other estimating object transformations ('where'), to simultaneously recognize an object and estimate its pose by jointly maximizing the a posteriori probability of generating the observed visual data. We provide experimental results demonstrating the ability of such networks to factor retinal stimuli into object-centred features and object-invariant transformation estimates.
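
To make the first-order Taylor idea concrete, the following is a minimal one-dimensional sketch in Python with NumPy. It is not the model described above: there is no learning and no 'what' network. It only illustrates the approximation I(x + s) ≈ I(x) + s · dI/dx, in which a differential operator applied to a reference pattern lets a small translation s be estimated by least squares, loosely analogous to a 'where' estimate. The function name `estimate_shift`, the Gaussian test pattern, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np


def estimate_shift(reference, observed, x):
    """Least-squares estimate of a small shift s under the first-order
    Taylor model: observed(x) ~ reference(x) + s * d(reference)/dx."""
    gradient = np.gradient(reference, x)   # finite-difference derivative dI/dx
    residual = observed - reference        # the part the shift must explain
    shift = float(gradient @ residual / (gradient @ gradient))
    return shift, gradient


# Hypothetical test pattern: a Gaussian bump sampled on a regular grid.
x = np.linspace(-5.0, 5.0, 201)
true_shift = 0.1                               # small relative to the bump width
reference = np.exp(-x ** 2)                    # I(x)
observed = np.exp(-(x + true_shift) ** 2)      # I(x + s)

estimated_shift, gradient = estimate_shift(reference, observed, x)

# First-order reconstruction of the shifted pattern from the reference.
reconstruction = reference + estimated_shift * gradient

error_without_model = np.linalg.norm(observed - reference)
error_with_model = np.linalg.norm(observed - reconstruction)

print(f"estimated shift: {estimated_shift:.4f}")
print(f"residual norm without shift model: {error_without_model:.4f}")
print(f"residual norm with first-order model: {error_with_model:.4f}")
```

The approximation holds only for shifts that are small relative to the spatial scale of the pattern. For larger shifts the neglected higher-order terms dominate and the estimate degrades.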
