Active Object Recognition with a Space-Variant Retina

When independent component analysis (ICA) is applied to color natural images, the representation it learns has spatiochromatic properties similar to the responses of neurons in primary visual cortex. Existing ICA models have been applied only to pixel patches, which does not take into account the space-variant nature of human vision. To address this, we use the space-variant log-polar transformation to acquire samples from color natural images and then apply ICA to the acquired samples. We analyze the spatiochromatic properties of the learned ICA filters. Qualitatively, the model matches the receptive field properties of neurons in primary visual cortex, including the same opponent-color structure and a higher density of receptive fields in the foveal region than in the periphery. We also adopt the “self-taught learning” paradigm from machine learning to assess the model on active object and face classification, where it is competitive with the best approaches in computer vision.
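
The pipeline described above (log-polar sampling around a fixation point, followed by ICA on the samples) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the grid size (`n_rings`, `n_wedges`), nearest-neighbour sampling, the plain symmetric FastICA with a tanh nonlinearity, and the random array standing in for a color natural image.

```python
import numpy as np

def log_polar_sample(image, center, n_rings=16, n_wedges=32, r_min=1.0, r_max=None):
    """Sample an H x W x C image on a log-polar grid (nearest neighbour).

    Ring radii grow geometrically from r_min to r_max, so samples are
    dense near the fixation point and sparse in the periphery.
    """
    h, w = image.shape[:2]
    cy, cx = center
    if r_max is None:
        r_max = min(h, w) / 2.0
    radii = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
    angles = 2.0 * np.pi * np.arange(n_wedges) / n_wedges
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    ys = np.clip(np.rint(cy + rr * np.sin(aa)), 0, h - 1).astype(int)
    xs = np.clip(np.rint(cx + rr * np.cos(aa)), 0, w - 1).astype(int)
    return image[ys, xs]  # shape: (n_rings, n_wedges, C)

def whiten(X):
    """Center and PCA-whiten the rows of X (n_samples x n_features)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-8                      # drop degenerate directions
    W = evecs[:, keep] / np.sqrt(evals[keep])
    return Xc @ W, W

def fast_ica(Z, n_iter=200, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on whitened data Z."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    W = np.linalg.qr(rng.standard_normal((d, d)))[0]
    for _ in range(n_iter):
        Y = Z @ W.T                          # column i holds w_i^T z
        G = np.tanh(Y)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row w
        W_new = (G.T @ Z) / n - np.diag((1.0 - G**2).mean(axis=0)) @ W
        # Symmetric decorrelation: W <- (W W^T)^(-1/2) W
        u, _, vt = np.linalg.svd(W_new)
        W = u @ vt
    return W

# Hypothetical usage: a random array stands in for a color natural image.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
fixations = rng.integers(10, 54, size=(500, 2))
X = np.stack([
    log_polar_sample(image, f, n_rings=4, n_wedges=8, r_max=8.0).ravel()
    for f in fixations
])
Z, W_white = whiten(X)
W_ica = fast_ica(Z)
filters = W_ica @ W_white.T  # each row is one filter over the log-polar samples
```

Each row of `filters` acts on a flattened log-polar sample of shape `(n_rings, n_wedges, 3)`, so reshaping a row back to that shape gives a view of the filter in ring and wedge coordinates. On random input the filters carry no meaningful structure; the sketch only shows how the two stages connect.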
