Development of a Bayesian Estimator for Audio-Visual Integration: A Neurocomputational Study

The brain integrates information from different sensory modalities to generate a coherent and accurate percept of external events. Several experimental studies suggest that this integration follows the principle of Bayesian estimation. However, the neural mechanisms responsible for this behavior, and how it develops in a multisensory environment, are still insufficiently understood. We recently presented a neural network model of audio-visual integration (Neural Computation, 2017) to investigate how a Bayesian estimator can develop spontaneously from the statistics of external stimuli. The model assumes the presence of two topologically organized unimodal areas (auditory and visual). Neurons in each area receive an input from the external environment, computed as the inner product of the sensory-specific stimulus and the receptive-field synapses, plus a cross-modal input from neurons of the other modality. Based on sensory experience, synapses were trained via Hebbian potentiation combined with a decay term. The aim of the present work is to improve the previous model by including a more realistic distribution of visual stimuli: visual stimuli have higher spatial accuracy at the central azimuthal coordinate and lower accuracy at the periphery. Moreover, their prior probability is highest at the center and decreases toward the periphery. Simulations show that, after training, the receptive fields of visual and auditory neurons shrink to reproduce the accuracy of the input (both at the center and at the periphery in the visual case), thus realizing the likelihood estimate of unimodal spatial position. Moreover, the preferred positions of visual neurons contract toward the center, thus encoding the prior probability of the visual input. Finally, a prior probability of the co-occurrence of audio-visual stimuli is encoded in the cross-modal synapses. The model is able to simulate the main properties of a Bayesian estimator and to reproduce behavioral data in all conditions examined. In particular, under unisensory conditions the visual estimates exhibit a bias toward the fovea, which increases with the level of noise. Under cross-modal conditions, the standard deviation of the estimates decreases when congruent audio-visual stimuli are used, and a ventriloquism effect becomes evident with spatially disparate stimuli. Moreover, the ventriloquism effect decreases with eccentricity.
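As a rough illustration of the training scheme summarized above, the following Python sketch trains receptive-field synapses with Hebbian potentiation plus a decay term, using visual stimuli whose spatial accuracy and prior probability both depend on eccentricity. Every name, constant, and the activation function here is a hypothetical stand-in; the published model's equations are given in the Neural Computation (2017) paper and are not reproduced here.

```python
import numpy as np

# Minimal sketch of the training scheme described in the abstract.
# All names, constants, and the sigmoidal activation are hypothetical
# stand-ins, not the published model's equations.

np.random.seed(0)
N = 180                       # one neuron per azimuthal degree, per modality
positions = np.arange(N)
fovea = N // 2

def gaussian_stimulus(center, sigma):
    """Sensory-specific stimulus: a Gaussian bump of activity."""
    return np.exp(-0.5 * ((positions - center) / sigma) ** 2)

def unimodal_input(W, stimulus):
    """Input to each neuron: inner product of the stimulus and the
    neuron's receptive-field synapses (one row of W per neuron)."""
    return W @ stimulus

def activation(u):
    """Sigmoidal activation (hypothetical slope and offset)."""
    return 1.0 / (1.0 + np.exp(-5.0 * (u - 1.0)))

def hebbian_step(W, pre, post, lr=0.01, decay=0.001):
    """Hebbian potentiation (post * pre) plus a decay term."""
    return W + lr * np.outer(post, pre) - decay * W

# Receptive-field synapses, small random initialization.
W_vis = np.abs(np.random.randn(N, N)) * 0.01
W_aud = np.abs(np.random.randn(N, N)) * 0.01

for _ in range(5000):
    # Visual positions are drawn with a higher prior probability near
    # the center, and visual accuracy degrades with eccentricity.
    center = int(np.clip(np.random.normal(fovea, 30.0), 0, N - 1))
    sigma_vis = 2.0 + 0.1 * abs(center - fovea)
    s_vis = gaussian_stimulus(center, sigma_vis)
    s_aud = gaussian_stimulus(center, 8.0)   # audition: uniform accuracy

    z_vis = activation(unimodal_input(W_vis, s_vis))
    z_aud = activation(unimodal_input(W_aud, s_aud))

    W_vis = hebbian_step(W_vis, s_vis, z_vis)
    W_aud = hebbian_step(W_aud, s_aud, z_aud)
    # Cross-modal synapses (visual -> auditory and vice versa) would be
    # trained with the same Hebbian rule on co-occurring z_vis and z_aud.
```

The decay term bounds weight growth, so after many presentations each row of the weight matrix settles toward the average stimulus profile at that neuron's preferred position, which is how receptive fields come to mirror the input statistics.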
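For reference, the behavioral effects the abstract reports (reduced SD with congruent cues, ventriloquism with disparate cues) are the standard signatures of reliability-weighted Bayesian fusion. A minimal statement of those textbook formulas, not the model's own derivation:

```latex
\hat{s}_{AV} = \frac{\sigma_A^2\,\hat{s}_V + \sigma_V^2\,\hat{s}_A}{\sigma_V^2 + \sigma_A^2},
\qquad
\sigma_{AV}^2 = \frac{\sigma_V^2\,\sigma_A^2}{\sigma_V^2 + \sigma_A^2}
\le \min\!\left(\sigma_V^2,\,\sigma_A^2\right).
```

Because the visual variance grows with eccentricity, the visual weight in the fused estimate shrinks in the periphery, which is consistent with the reported decrease of the ventriloquism effect away from the fovea.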

[1] M. T. Wallace et al. Development of Multisensory Neurons and Multisensory Integration in Cat Superior Colliculus, 1997, The Journal of Neuroscience.

[2] A. Caramazza et al. Functional connectivity of visual cortex in the blind follows retinotopic organization principles, 2015, Brain: A Journal of Neurology.

[3] I. Fine et al. Resting-State Retinotopic Organization in the Absence of Retinal Input and Visual Experience, 2015, The Journal of Neuroscience.

[4] M. Ursino et al. Multisensory Bayesian Inference Depends on Synapse Maturation during Training: Theoretical Analysis and Neural Modeling Implementation, 2017, Neural Computation.

[5] G. DeAngelis et al. Neural correlates of multisensory cue integration in macaque MSTd, 2008, Nature Neuroscience.

[6] M. Konishi et al. Effects of Interaural Decorrelation on Neural and Behavioral Detection of Spatial Cues, 1998, Neuron.

[7] M. P. Eckstein et al. Foveal analysis and peripheral selection during active visual sampling, 2014, Proceedings of the National Academy of Sciences.

[8] A. Pouget et al. Probabilistic brains: knowns and unknowns, 2013, Nature Neuroscience.

[9] G. F. Cooper et al. Development of the Brain depends on the Visual Environment, 1970, Nature.

[10] R. C. Froemke et al. Development of auditory cortical synaptic receptive fields, 2011, Neuroscience & Biobehavioral Reviews.

[11] D. Burr et al. The Ventriloquist Effect Results from Near-Optimal Bimodal Integration, 2004, Current Biology.

[12] F. Cazettes et al. Spatial cue reliability drives frequency tuning in the barn owl's midbrain, 2014, eLife.

[13] J. Kerr et al. Visual resolution in the periphery, 1971.

[14] J. L. Peña et al. Neural representation of probabilities for Bayesian inference, 2015, Journal of Computational Neuroscience.

[15] W. J. Ma et al. Bayesian inference with probabilistic population codes, 2006, Nature Neuroscience.

[16] L. Spillmann et al. Perceptive field size in fovea and periphery of the light- and dark-adapted retina, 1980, Vision Research.

[17] M. Wallace et al. Unifying multisensory signals across time and space, 2004, Experimental Brain Research.

[18] H. Leibowitz et al. Practice effects for visual resolution in the periphery, 1979, Perception & Psychophysics.

[19] F. Cazettes et al. Cue Reliability Represented in the Shape of Tuning Curves in the Owl's Sound Localization System, 2016, The Journal of Neuroscience.

[20] P. R. Jones et al. Development of Cue Integration in Human Navigation, 2008, Current Biology.

[21] M. Ursino et al. Neurocomputational approaches to modelling multisensory integration in the brain: A review, 2014, Neural Networks.

[22] D. Dacey. The mosaic of midget ganglion cells in the human retina, 1993, The Journal of Neuroscience.

[23] I. Rentschler et al. Peripheral vision and pattern recognition: a review, 2011, Journal of Vision.

[24] N. Bolognini et al. A neurocomputational analysis of the sound-induced flash illusion, 2014, NeuroImage.

[25] K. C. Wood et al. Relative sound localisation abilities in human listeners, 2015, The Journal of the Acoustical Society of America.

[26] U. R. Beierholm et al. Sound-induced flash illusion as an optimal percept, 2005, NeuroReport.

[27] D. M. Green et al. Sound localization by human listeners, 1991, Annual Review of Psychology.

[28] A. Diederich et al. Why aren’t all deep superior colliculus neurons multisensory? A Bayes’ ratio analysis, 2004, Cognitive, Affective, & Behavioral Neuroscience.

[29] W. Ma et al. Towards a neural implementation of causal inference in cue combination, 2013, Multisensory Research.

[30] U. R. Beierholm et al. Probability Matching as a Computational Strategy Used in Perception, 2010, PLoS Computational Biology.

[31] L. Shams et al. Early modulation of visual cortex by sound: an MEG study, 2005, Neuroscience Letters.

[32] A. Pouget et al. Reading population codes: a neural implementation of ideal observers, 1999, Nature Neuroscience.

[33] R. D. Freeman et al. Meridional amblyopia: evidence for modification of the human visual system by early visual experience, 1973, Vision Research.

[34] R. Oehler. Spatial interactions in the rhesus monkey retina: a behavioural study using the Westheimer paradigm, 2004, Experimental Brain Research.

[35] S. P. Johnson. How Infants Learn About the Visual World, 2010, Cognitive Science.

[36] F. Lepore et al. The ventriloquist in periphery: impact of eccentricity-related reliability on audio-visual localization, 2013, Journal of Vision.

[37] T. D. Mrsic-Flogel et al. Experience-Dependent Specialization of Receptive Field Surround for Selective Coding of Natural Scenes, 2014, Neuron.

[38] M. Rasch et al. Decentralized Multisensory Information Integration in Neural Systems, 2016, The Journal of Neuroscience.

[39] L. Shams et al. Biases in Visual, Auditory, and Audiovisual Perception of Space, 2015, PLoS Computational Biology.

[40] M. Radeau et al. Erratum to: Cross-modal bias and perceptual fusion with auditory-visual spatial discordance, 1981.

[41] P. R. Jones et al. Auditory Localisation Biases Increase with Sensory Uncertainty, 2017, Scientific Reports.

[42] E. P. Simoncelli et al. Cardinal rules: Visual orientation perception reflects knowledge of environmental statistics, 2011, Nature Neuroscience.

[43] D. R. Perrott et al. Minimum audible angle thresholds for sources varying in both elevation and azimuth, 1990, The Journal of the Acoustical Society of America.

[44] C. R. Fetsch et al. Neural correlates of reliability-based cue weighting during multisensory integration, 2011, Nature Neuroscience.

[45] G. Recanzone et al. The biological basis of audition, 2008, Annual Review of Psychology.

[46] R. Held et al. Visual acuity and its meridional variations in children aged 7–60 months, 1983, Vision Research.

[47] A. Cowey et al. Human cortical magnification factor and its relation to visual acuity, 2004, Experimental Brain Research.

[48] J. Lewald et al. Sound localization with eccentric head position, 2000, Behavioural Brain Research.

[49] T. J. Anastasio et al. Modeling Cross-Modal Enhancement and Modality-Specific Suppression in Multisensory Neurons, 2003, Neural Computation.

[50] M. Wallace et al. Visual Localization Ability Influences Cross-Modal Bias, 2003, Journal of Cognitive Neuroscience.

[51] T. Rohe. Causal inference in multisensory perception and the brain, 2014.

[52] R. Zemel et al. Inference and computation with population codes, 2003, Annual Review of Neuroscience.

[53] J. C. Middlebrooks. Sound localization, 2015, Handbook of Clinical Neurology.

[54] E. Newport et al. Statistical Learning: From Acquiring Specific Items to Forming General Rules, 2012, Current Directions in Psychological Science.

[55] B. J. Fischer et al. Owl's behavior and neural representation predicted by Bayesian inference, 2011, Nature Neuroscience.

[56] D. C. Burr et al. Young Children Do Not Integrate Visual and Haptic Form Information, 2008, Current Biology.

[57] S. Shimojo et al. Illusions: What you see is what you hear, 2000, Nature.

[58] M. Ursino et al. A neural network for learning the meaning of objects and words from a featural representation, 2015, Neural Networks.

[59] C. R. Fetsch et al. Dynamic Reweighting of Visual and Vestibular Cues during Self-Motion Perception, 2009, The Journal of Neuroscience.