A Multisensory Cortical Network for Understanding Speech in Noise

In noisy environments, listeners tend to hear a speaker's voice yet struggle to understand what is said. The most effective way to improve intelligibility in such conditions is to watch the speaker's mouth movements. Here we identify the neural networks that distinguish understanding speech from merely hearing it, and determine how the brain applies visual information to improve intelligibility. Using functional magnetic resonance imaging, we show that understanding speech-in-noise is supported by a network of brain areas including the left superior parietal lobule, the motor/premotor cortex, and the left anterior superior temporal sulcus (STS), a likely apex of the acoustic processing hierarchy. Multisensory integration likely improves comprehension by enhancing communication between the left temporal–occipital boundary, the left medial-temporal lobe, and the left STS. These findings demonstrate how the brain uses information from multiple modalities to improve speech comprehension in naturalistic, acoustically adverse conditions.
