THE COMPUTATIONAL MAGIC OF THE VENTRAL STREAM: TOWARDS A THEORY

I conjecture that the sample complexity of object recognition is mostly due to geometric image transformations and that a main goal of the ventral stream - V1, V2, V4 and IT - is to learn-and-discount image transformations. The most surprising implication of the theory emerging from these assumptions is that the computational goals and detailed properties of cells in the ventral stream follow from symmetry properties of the visual world through a process of unsupervised correlational learning. From the assumption of a hierarchy of areas with receptive fields of increasing size, the theory predicts that the size of the receptive fields determines which transformations are learned during development and then factored out during normal processing; that the transformation represented in each area determines the tuning of the neurons in that area, independently of the statistics of natural images; and that class-specific transformations are learned and represented at the top of the ventral stream hierarchy. Some of the main predictions of this theory-in-fieri are: the types of transformations that are learned from visual experience depend on the aperture size (measured in terms of wavelength) and thus on the area (layer in the models), assuming that the aperture size increases with layers; the mix of transformations learned determines the properties of the receptive fields - oriented bars in V1+V2, radial and spiral patterns in V4, up to class-specific tuning in AIT (e.g., face-tuned cells); class-specific modules - such as faces, places and possibly body areas - should exist in IT to process images of object classes.
