Integration of auditory and visual information in the recognition of realistic objects

Recognizing a natural object requires pooling information from several sensory modalities while ignoring information from competing objects. Because the same semantic knowledge can be accessed through different modalities, it is possible to explore the retrieval of supramodal object concepts. Here, object-recognition processes were investigated by manipulating the relationships between sensory modalities, specifically the semantic content and the spatial alignment of auditory and visual information. Experiments were run in a realistic virtual environment. Participants were asked to react as fast as possible to a target object presented in the visual and/or the auditory modality and to inhibit their response to a distractor object (go/no-go task). Spatial alignment had no effect on object-recognition time. The only spatial effect observed was a stimulus–response compatibility between the auditory stimulus and the hand position. Reaction times were significantly shorter for semantically congruent bimodal stimuli than would be predicted by independent processing of information about the auditory and visual targets. Interestingly, this bimodal facilitation effect was twice as large as that found in previous studies that also used information-rich stimuli. An interference effect (i.e., longer reaction times to semantically incongruent stimuli than to the corresponding unimodal stimulus) was observed only when the distractor was auditory. When the distractor was visual, the semantic incongruence did not interfere with object recognition. Our results show that immersive displays with large visual stimuli may provide large multimodal integration effects, and reveal a possible asymmetry in the attentional filtering of irrelevant auditory and visual information.
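
The comparison against "independent processing" can be illustrated with the race model inequality, a common bound in the redundant-target literature: if the auditory and visual channels race independently, then P(RT_AV ≤ t) ≤ P(RT_A ≤ t) + P(RT_V ≤ t) for every time t. The sketch below is a minimal illustration, assuming this inequality is the relevant test; the abstract does not state which procedure was actually used. The function names, probe times, and reaction times are all hypothetical and are not data from the study.

```python
from bisect import bisect_right


def ecdf(samples, t):
    """Empirical P(RT <= t) for a list of reaction times (ms)."""
    ordered = sorted(samples)
    return bisect_right(ordered, t) / len(ordered)


def race_model_violations(rt_a, rt_v, rt_av, probes):
    """Return the probe times at which the bimodal CDF exceeds the
    race-model bound min(1, F_A(t) + F_V(t))."""
    violations = []
    for t in probes:
        bound = min(1.0, ecdf(rt_a, t) + ecdf(rt_v, t))
        if ecdf(rt_av, t) > bound:
            violations.append(t)
    return violations


# Made-up reaction times in milliseconds (illustration only).
rt_a = [420, 450, 480, 510, 540]    # auditory-only targets
rt_v = [400, 430, 460, 490, 520]    # visual-only targets
rt_av = [300, 320, 340, 360, 380]   # congruent bimodal targets
probes = [350, 400, 450, 500]

print(race_model_violations(rt_a, rt_v, rt_av, probes))
```

With these invented numbers the bound is exceeded at the three earliest probe times, which is the pattern usually read as evidence against purely independent processing. A real analysis would work on per-participant quantiles and apply a statistical test rather than a raw comparison.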
