Optimized but Not Maximized Cue Integration for 3D Visual Perception

Reconstructing three-dimensional (3D) scenes from two-dimensional (2D) retinal images is an ill-posed problem. Despite this, our 3D perception of the world based on 2D retinal images is seemingly accurate and precise. The integration of distinct visual cues is essential for robust 3D perception in humans, but it is unclear if this mechanism is conserved in non-human primates, and how the underlying neural architecture constrains 3D perception. Here we assess 3D perception in macaque monkeys using a surface orientation discrimination task. We find that perception is generally accurate, but precision depends on the spatial pose of the surface and available cues. The results indicate that robust perception is achieved by dynamically reweighting the integration of stereoscopic and perspective cues according to their pose-dependent reliabilities. They further suggest that 3D perception is influenced by a prior for the 3D orientation statistics of natural scenes. We compare the data to simulations based on the responses of 3D orientation selective neurons. The results are explained by a model in which two independent neuronal populations representing stereoscopic and perspective cues (with perspective signals from the two eyes combined using nonlinear canonical computations) are optimally integrated through linear summation. Perception of combined-cue stimuli is optimal given this architecture. However, an alternative architecture in which stereoscopic cues and perspective cues detected by each eye are represented by three independent populations yields two times greater precision than observed. This implies that, due to canonical computations, cue integration for 3D perception is optimized but not maximized.

Author summary

Our eyes only sense two-dimensional projections of the world (like a movie on a screen), yet we perceive the world in three dimensions. To create reliable 3D percepts, the human visual system integrates distinct visual signals according to their reliabilities, which depend on conditions such as how far away an object is located and how it is oriented. Here we find that non-human primates similarly integrate different 3D visual signals, and that their perception is influenced by the 3D orientation statistics of natural scenes. Cue integration is thus a conserved mechanism for creating robust 3D percepts by the primate brain. Using simulations of neural population activity, based on neuronal recordings from the same animals, we show that some computations which occur widely in the brain facilitate 3D perception, while others hinder perception. This work addresses key questions about how neural systems solve the difficult problem of generating 3D percepts, identifies a plausible neural architecture for implementing robust 3D vision, and reveals how neural computation can simultaneously optimize and curb perception.
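
As a rough illustration of what integrating cues "according to their reliabilities" means, the sketch below implements the standard textbook rule for combining two independent Gaussian estimates, where each cue is weighted by its inverse variance. It is not the neural population model described above. The function name `combine_cues`, its parameters, and the example numbers are hypothetical, and treating orientation estimates as linear Gaussian variables (rather than circular ones) is a simplifying assumption.

```python
import math


def combine_cues(mu_stereo, sigma_stereo, mu_persp, sigma_persp):
    """Reliability-weighted combination of two independent Gaussian cue estimates.

    Hypothetical helper for illustration only. Reliability is defined as
    inverse variance (1 / sigma**2); each cue's weight is its share of the
    total reliability.
    """
    r_stereo = 1.0 / sigma_stereo ** 2
    r_persp = 1.0 / sigma_persp ** 2
    w_stereo = r_stereo / (r_stereo + r_persp)
    w_persp = 1.0 - w_stereo
    # Combined estimate is the weighted linear sum of the single-cue estimates.
    mu_combined = w_stereo * mu_stereo + w_persp * mu_persp
    # Reliabilities add, so the combined sigma is never larger than either cue's.
    sigma_combined = math.sqrt(1.0 / (r_stereo + r_persp))
    return mu_combined, sigma_combined, w_stereo, w_persp


# Example with made-up values (degrees): the stereoscopic cue is more reliable,
# so the combined estimate is pulled toward it.
mu, sigma, w_s, w_p = combine_cues(0.0, 1.0, 10.0, 3.0)
print(f"estimate={mu:.2f}, sigma={sigma:.3f}, w_stereo={w_s:.2f}, w_persp={w_p:.2f}")
```

Because the weights depend only on the current single-cue variances, changing those variances with surface pose automatically shifts the weighting between cues, which is the sense of "dynamic reweighting" in this sketch.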
