Resolving multisensory conflict: a strategy for balancing the costs and benefits of audio-visual integration

To maintain a coherent, unified percept of the external environment, the brain must continuously combine information encoded by its different sensory systems. Contemporary models suggest that multisensory integration produces a weighted average of sensory estimates, in which the contribution of each system to the final multisensory percept is governed by the relative reliability of the information it provides (maximum-likelihood estimation). In the present study, we investigate interactions between auditory and visual rate perception, using a task in which observers make judgments in one modality while ignoring conflicting rate information presented in the other. As inter-modal discrepancy increases, we find a gradual transition from partial cue integration to complete cue segregation, a pattern that is inconsistent with mandatory implementation of maximum-likelihood estimation. To explain these findings, we implement a simple Bayesian model of integration that is also able to predict observer performance with novel stimuli. The model assumes that the brain takes into account prior knowledge about the correspondence between auditory and visual rate signals when determining the degree of integration to implement. This provides a strategy for balancing the benefits accrued by integrating sensory estimates arising from a common source against the costs of conflating information relating to independent objects or events.
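The contrast between mandatory maximum-likelihood fusion and prior-dependent integration can be illustrated with a minimal sketch. It is not the model fitted in the study. The function names, the form of the prior (a Gaussian ridge along equal auditory and visual rates plus a constant baseline `omega`), and all parameter values are assumptions chosen for illustration. `mle_fuse` computes the reliability-weighted average, and `auditory_estimate` computes the posterior mean of the auditory rate on a grid under the assumed prior.

```python
import math


def gaussian(x, mu, sigma):
    """Unnormalised Gaussian; constant factors cancel in the posterior mean."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def mle_fuse(x_a, sigma_a, x_v, sigma_v):
    """Reliability-weighted average of two estimates (mandatory fusion)."""
    w_a = sigma_v ** 2 / (sigma_a ** 2 + sigma_v ** 2)  # weight on audition
    fused = w_a * x_a + (1.0 - w_a) * x_v
    fused_sigma = math.sqrt(
        sigma_a ** 2 * sigma_v ** 2 / (sigma_a ** 2 + sigma_v ** 2)
    )
    return fused, fused_sigma


def auditory_estimate(x_a, x_v, sigma_a=1.0, sigma_v=2.0,
                      sigma_p=1.0, omega=0.05,
                      lo=0.0, hi=40.0, step=0.2):
    """Posterior mean of the auditory rate given both measurements.

    Assumed prior over (s_a, s_v): omega + Gaussian(s_a - s_v; 0, sigma_p).
    The ridge favours matching rates (common source); the baseline omega
    allows for unrelated rates (independent sources).
    """
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    like_a = [gaussian(s, x_a, sigma_a) for s in grid]
    like_v = [gaussian(s, x_v, sigma_v) for s in grid]

    total = 0.0
    weighted = 0.0
    for s_a, l_a in zip(grid, like_a):
        for s_v, l_v in zip(grid, like_v):
            p = l_a * l_v * (omega + gaussian(s_a - s_v, 0.0, sigma_p))
            total += p
            weighted += p * s_a
    return weighted / total


def bias_fraction(discrepancy, x_a=10.0):
    """Shift of the auditory estimate towards vision, as a fraction of the conflict."""
    estimate = auditory_estimate(x_a, x_a + discrepancy)
    return (estimate - x_a) / discrepancy


if __name__ == "__main__":
    print("MLE fusion:", mle_fuse(10.0, 1.0, 12.0, 2.0))
    for d in (1.0, 4.0, 8.0):
        print("discrepancy", d, "bias fraction", round(bias_fraction(d), 3))
```

Under mandatory fusion the weight given to the irrelevant modality is fixed by the two reliabilities, so the bias would be a constant fraction of the conflict. With the assumed prior, the bias fraction shrinks as the discrepancy grows, because the constant baseline comes to dominate the ridge: partial integration for small conflicts, near-complete segregation for large ones.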
