Top-down and bottom-up modulation in processing bimodal face/voice stimuli

Background
Processing of multimodal information is a critical capacity of the human brain, with classic studies showing that bimodal stimulation can either facilitate or interfere with perceptual processing. Comparing activity to congruent and incongruent bimodal stimuli can reveal sensory dominance in particular cognitive tasks.

Results
We investigated audiovisual interactions driven by stimulus properties (bottom-up influences) or by task (top-down influences) on congruent and incongruent simultaneously presented faces and voices while ERPs were recorded. Subjects performed gender categorisation, directing attention either to faces or to voices, and also judged whether the face/voice stimuli were congruent in terms of gender. Behaviourally, the unattended modality affected processing in the attended modality; the disruption was greater for attended voices. ERPs revealed top-down modulations of early brain processing (30-100 ms) over unisensory cortices. No effects were found on N170 or VPP, but from 180-230 ms larger right frontal activity was seen for incongruent than for congruent stimuli.

Conclusions
Our data demonstrate that in a gender categorisation task the processing of faces dominates over the processing of voices. Brain activity showed different modulation by top-down and bottom-up information: top-down influences modulated early brain activity, whereas bottom-up interactions occurred relatively late.
