Invariance of visual operations at the level of receptive fields

The brain is able to maintain a stable perception even though the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This paper presents a theory for achieving basic invariance properties already at the level of receptive fields. Specifically, the presented framework comprises (i) local scaling transformations caused by objects of different sizes and at different distances from the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer, and (iv) local multiplicative intensity transformations caused by illumination variations. The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles that are in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 are close to ideal with respect to these idealized requirements. By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and can support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination.
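
The role of a selection mechanism over the scale parameter can be illustrated with a small closed-form sketch. It assumes one concrete choice that the abstract does not spell out: the receptive field response is a scale-normalized Laplacian of a Gaussian-smoothed image, and the selected scale is the one that maximizes the response magnitude. The stimulus is an idealized 2-D Gaussian blob, and the function names `normalized_laplacian_response` and `select_scale` are hypothetical, introduced only for this illustration.

```python
import math


def normalized_laplacian_response(t, t0, amplitude=1.0):
    """Magnitude of t * Laplacian(L) at the centre of a 2-D Gaussian blob.

    The blob is amplitude * g(x; t0), where g is a unit-mass Gaussian with
    variance t0. Smoothing with a Gaussian of variance t gives
    amplitude * g(x; t0 + t), whose Laplacian at the origin has magnitude
    amplitude / (pi * (t0 + t)**2). Multiplying by t is the scale
    normalization (assumed here with normalization power 1).
    """
    return amplitude * t / (math.pi * (t0 + t) ** 2)


def select_scale(t0, scales, amplitude=1.0):
    """Return the scale level with the strongest normalized response."""
    return max(scales, key=lambda t: normalized_laplacian_response(t, t0, amplitude))


# Geometric grid of scale levels (variances) from 2**-4 to 2**10.
scales = [2.0 ** (k / 4.0) for k in range(-16, 41)]

t0 = 4.0  # blob variance in the original image
s = 2.0   # spatial scaling factor applied to the image domain

# Rescaling f'(x') = f(x'/s) turns g(x; t0) into s**2 * g(x'; s**2 * t0) in 2-D.
t_hat = select_scale(t0, scales)
t_hat_scaled = select_scale(s ** 2 * t0, scales, amplitude=s ** 2)

r = normalized_laplacian_response(t_hat, t0)
r_scaled = normalized_laplacian_response(t_hat_scaled, s ** 2 * t0, amplitude=s ** 2)

print(t_hat, t_hat_scaled)  # selected scales follow the blob size
print(r, r_scaled)          # responses at the selected scales agree
```

In this sketch the selected scale is multiplied by s**2 when the image is rescaled by s, while the response at the selected scale is unchanged up to rounding. This is the sense in which a covariant family of receptive fields combined with a selection step can give a scale-invariant response; the affine and Galilean cases mentioned in the abstract would need analogous selection over shape and velocity parameters, which are not modelled here.
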
The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in spacetime, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

Citation: Lindeberg T (2013) Invariance of visual operations at the level of receptive fields. PLoS ONE 8(7): e66990. doi:10.1371/journal.pone.0066990
Editor: Luis M Martinez, CSIC-Univ Miguel Hernandez, Spain
Received October 16, 2012; Accepted May 14, 2013; Published July 19, 2013
Copyright: 2013 Tony Lindeberg. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Funding was received from The Swedish Research Council contract 2010–4766; The Royal Swedish Academy of Sciences; and The Knut and Alice Wallenberg foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The author has declared that no competing interests exist.
E-mail: tony@csc.kth.se
