Look who's talking: The deployment of visuo-spatial attention during multisensory speech processing under noisy environmental conditions

In a crowded scene, we can effectively focus our attention on a specific speaker while largely ignoring sensory inputs from other speakers. How attended speech inputs are extracted from similar competing information has been studied primarily in the auditory domain. Here we examined the deployment of visuo-spatial attention in multiple-speaker scenarios. Steady-state visual evoked potentials (SSVEPs) were monitored as a real-time index of visual attention towards three competing speakers. Participants were instructed to detect a target syllable produced by the center speaker and to ignore syllables from two flanking speakers. The study incorporated interference trials (syllables from all three speakers), no-interference trials (a syllable from the center speaker only), and periods without speech stimulation in which static faces were presented. During interference trials, an enhancement of the flanking-speaker-induced SSVEP was found 70-220 ms after sound onset over left temporal scalp. This enhancement was negatively correlated with the behavioral performance of participants: those who showed the largest enhancements had the worst speech recognition performance. Additionally, poorly performing participants exhibited an enhanced flanking-speaker-induced SSVEP over visual scalp during periods without speech stimulation. The present study provides neurophysiological evidence that the deployment of visuo-spatial attention to flanking speakers interferes with the recognition of multisensory speech signals under noisy environmental conditions.
