Visual cues can modulate integration and segregation of objects in auditory scene analysis

The assignment of concurrent sounds to distinct auditory objects is known to depend on temporal and spectral cues. When tones of high and low frequencies are presented in alternation, they can be perceived either as a single, integrated melody or as two parallel, segregated melodic lines, depending on the presentation rate and the frequency separation between the sounds. At intermediate separations, in the 'ambiguous' range, both percepts are possible. We conducted an electrophysiological experiment to determine whether an ambiguous sound organization could be modulated toward an integrated or a segregated percept by synchronously presented visual cues. Two sets of sounds (one high frequency, one low frequency) were interleaved. To promote integration or segregation, visual stimuli were synchronized either to the within-set frequency pattern or to the across-set intensity pattern. Elicitation of the mismatch negativity (MMN) component of event-related brain potentials was used to index the segregated organization while participants performed no task with the sounds. MMN was elicited only when the visual pattern promoted segregation of the sounds. The results demonstrate a cross-modal effect on auditory object perception: the ambiguity of the sound organization was resolved by the synchronous presentation of visual stimuli, which promoted either an integrated or a segregated percept of the sounds.
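
As an illustration of the streaming paradigm described above, the sketch below generates an interleaved high/low pure-tone sequence whose frequency separation can be varied across the integrated, ambiguous, and segregated ranges. All parameter values (base frequency, separation, tone duration, presentation rate) are illustrative assumptions and are not taken from the study.

```python
# Minimal sketch of an alternating high/low tone sequence of the kind described
# above. All parameter values (frequencies, tone duration, presentation rate)
# are illustrative assumptions; the abstract does not specify them.
import numpy as np

FS = 44100          # sample rate (Hz)
TONE_DUR = 0.05     # tone duration in seconds (assumed)
SOA = 0.125         # onset-to-onset interval in seconds (assumed presentation rate)
LOW_FREQ = 440.0    # frequency of the low tone set (assumed)
SEPARATION = 7.0    # high/low separation in semitones (the key streaming variable)
N_TONES = 40        # total number of tones in the sequence

def pure_tone(freq, dur=TONE_DUR, fs=FS):
    """Generate a pure tone with 5 ms raised-cosine onset/offset ramps."""
    t = np.arange(int(dur * fs)) / fs
    tone = np.sin(2 * np.pi * freq * t)
    ramp_len = int(0.005 * fs)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(ramp_len) / ramp_len))
    tone[:ramp_len] *= ramp
    tone[-ramp_len:] *= ramp[::-1]
    return tone

def alternating_sequence(separation_semitones=SEPARATION):
    """Interleave low (A) and high (B) tones as A B A B ...
    A small separation tends toward an integrated percept,
    a large one toward a segregated percept."""
    high_freq = LOW_FREQ * 2 ** (separation_semitones / 12.0)
    seq = np.zeros(int(N_TONES * SOA * FS))
    for i in range(N_TONES):
        freq = LOW_FREQ if i % 2 == 0 else high_freq
        start = int(i * SOA * FS)
        tone = pure_tone(freq)
        seq[start:start + tone.size] += tone
    return seq

sequence = alternating_sequence()  # e.g., write to a WAV file for listening
```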
