Human peripheral blur is optimal for object recognition

Our vision is sharpest at the center of gaze and becomes progressively blurrier toward the periphery. It is widely believed that this high foveal resolution evolved at the expense of peripheral acuity. But what if this sampling scheme is actually optimal for object recognition? To test this hypothesis, we trained deep neural networks on 'foveated' images, with high resolution near objects and increasingly sparse sampling into the periphery. Networks trained with a blur profile matching that of the human eye performed best, outperforming networks trained with shallower or steeper blur profiles. In humans too, categorization accuracy deteriorated only for steeper blur profiles. Thus, our blurry peripheral vision may have evolved to optimize object recognition, rather than merely as a consequence of wiring constraints.
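To make the idea of a 'foveated' image concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a grayscale image, a blur radius that grows linearly with distance from the fixation point, and a simple box blur standing in for whatever filtering or subsampling the study actually used. The names `box_blur` and `foveate` and the parameters `slope` and `max_radius` are hypothetical and chosen only for illustration; varying `slope` gives shallower or steeper blur profiles.

```python
import numpy as np

def box_blur(img, radius):
    """Separable box blur with edge padding; radius 0 returns a float copy."""
    img = img.astype(float)
    if radius == 0:
        return img.copy()
    size = 2 * radius + 1
    kernel = np.ones(size) / size
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns ("valid" trims the padding again).
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def foveate(img, center, slope=0.1, max_radius=8):
    """Blur a 2-D image more strongly the farther a pixel is from `center`.

    `slope` (blur radius in pixels per pixel of eccentricity) is an
    illustrative assumption, not a value taken from the paper.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    eccentricity = np.hypot(ys - center[0], xs - center[1])
    radius = np.clip(np.rint(slope * eccentricity), 0, max_radius).astype(int)
    out = np.empty((h, w), dtype=float)
    # Compute one blurred copy per distinct radius and pick pixels from it.
    for r in np.unique(radius):
        mask = radius == r
        out[mask] = box_blur(img, int(r))[mask]
    return out

# Usage: a random test image fixated at its center.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
foveated = foveate(image, center=(32, 32), slope=0.2)
```

Here a larger `slope` corresponds to a steeper profile (blur sets in closer to fixation) and a smaller one to a shallower profile. Matching the profile to the human eye would require the empirical falloff of acuity with eccentricity, which this sketch does not attempt to reproduce.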
