Peripheral pooling is tuned to the localization task.

Foveal and peripheral vision in the human visual system differ substantially. Peripheral vision must compress visual input onto fewer units, which it does through reduced visual acuity and larger receptive fields; this greatly reduces performance on many tasks, such as object recognition. Here, however, we show that the pooling operations implemented by peripheral vision provide exactly the invariance properties required by a self-localization task. We test the effect of different pooling sizes, as well as acuity reduction, on localization, object recognition, and scene categorization tasks. We find that peripheral pooling, but not reduced acuity, improves localization performance, whereas it is detrimental to object recognition performance.
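To make the distinction between the two manipulations concrete, the following is a minimal sketch, not the model used in the paper. It assumes that pooling can be approximated by a maximum over non-overlapping square windows of a feature map, and that acuity reduction can be approximated by block-averaging the input. The function names `max_pool` and `reduce_acuity` and the toy 8x8 feature map are illustrative assumptions.

```python
import numpy as np


def max_pool(features, size):
    """Max over non-overlapping size x size windows (assumed pooling model).

    Rows and columns that do not fill a complete window are cropped.
    """
    h, w = features.shape
    h, w = h - h % size, w - w % size
    blocks = features[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


def reduce_acuity(image, factor):
    """Crude acuity reduction (assumed): block-average, then upsample back."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    coarse = blocks.mean(axis=(1, 3))
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


# Toy feature map with a single active unit, and a copy shifted by one column.
fmap = np.zeros((8, 8))
fmap[1, 1] = 1.0
shifted = np.roll(fmap, 1, axis=1)  # active unit moves from (1, 1) to (1, 2)

# A small pooling window still distinguishes the two maps ...
small_same = np.array_equal(max_pool(fmap, 2), max_pool(shifted, 2))
# ... while a larger window maps both to the same pooled output.
large_same = np.array_equal(max_pool(fmap, 4), max_pool(shifted, 4))

# Acuity reduction smears the input but keeps the original resolution grid.
blurred = reduce_acuity(fmap, 4)
```

In this toy case the larger window is invariant to the one-column shift while the smaller one is not, which illustrates the kind of invariance that larger pooling regions provide. Whether that invariance helps or hurts depends on the task, as the abstract reports.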
