Near-optimal combination of disparity across a log-polar scaled visual field

The human visual system is foveated: we see fine spatial detail in central vision, whereas resolution is poor in the peripheral visual field, and this loss of resolution is approximately logarithmic. Additionally, the brain organizes visual input in polar coordinates. The image projection from the retina to primary visual cortex can therefore be described mathematically by the log-polar transform. Here, we test and model how this space-variant visual processing affects the processing of binocular disparity, a key component of human depth perception. We observe that the fovea preferentially processes disparities at fine spatial scales, whereas the visual periphery is tuned to coarse spatial scales, in line with the naturally occurring distributions of depths and disparities in the real world. We further show that the visual system integrates disparity information across the visual field in a near-optimal fashion. We develop a foveated, log-polar model that mimics the processing of depth information in primary visual cortex and that can process disparity directly in the cortical domain representation. This model takes real images as input and recreates the observed topography of disparity sensitivity in humans. Our findings support the notion that our foveated, binocular visual system has been moulded by the statistics of our visual environment.

Author summary

We investigate how humans perceive depth from binocular disparity at different spatial scales and across different regions of the visual field. We show that small changes in disparity-defined depth are detected best in central vision, whereas peripheral vision best captures the coarser structure of the environment. We also demonstrate that depth information extracted from different regions of the visual field is combined into a unified depth percept. We then construct an image-computable model of disparity processing that takes into account how the brain organizes the visual input at the retinae. The model operates directly in cortical image space and accounts well for human depth perception across the visual field.
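
The two computational ideas named above, the log-polar transform and near-optimal combination of information, can be illustrated with a minimal Python sketch. It is not the authors' model. The function names, the scale constant `rho0`, and the reading of "optimal" as inverse-variance (maximum-likelihood) weighting of independent, Gaussian-noise estimates are all assumptions made for illustration.

```python
import math


def to_log_polar(x, y, rho0=1.0):
    """Map a retinal position (x, y) to log-polar coordinates (xi, theta).

    rho0 is a hypothetical scale constant: the eccentricity mapped to xi = 0.
    """
    rho = math.hypot(x, y)  # eccentricity (distance from the fovea)
    if rho <= 0.0:
        raise ValueError("the log-polar transform is undefined at the origin")
    xi = math.log(rho / rho0)  # logarithmic compression of eccentricity
    theta = math.atan2(y, x)   # polar angle in radians
    return xi, theta


def combine_estimates(estimates, variances):
    """Inverse-variance weighted average of independent estimates.

    Assumes unbiased estimates with independent Gaussian noise; under that
    assumption this weighting gives the minimum-variance combined estimate.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    combined_variance = 1.0 / total  # never larger than the smallest input variance
    return combined, combined_variance
```

Under these assumptions, doubling the eccentricity shifts `xi` by the constant `math.log(2)` wherever the point lies, which is the sense in which a logarithmic fall-off in resolution becomes uniform in the mapped space. For the combination rule, the combined variance is always at or below that of the most reliable single estimate, so it gives a benchmark against which observed thresholds for combined regions could be compared.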
