Auditory and Language Contributions to Neural Encoding of Speech Features in Noisy Environments

Recognizing speech in noisy environments is challenging and involves both auditory and language mechanisms. Previous studies have demonstrated noise-robust neural tracking of the speech envelope, i.e., fluctuations in sound intensity, in human auditory cortex, which provides a plausible neural basis for noise-robust speech recognition. The current study aims to tease apart the auditory and language contributions to noise-robust envelope tracking by comparing 2 groups of listeners: native listeners of the testing language and foreign listeners who do not understand it. In the experiment, speech is mixed with spectrally matched stationary noise at 4 intensity levels, and neural responses are recorded using electroencephalography (EEG). As noise intensity increases, neural response gain increases in both groups of listeners, demonstrating auditory gain control mechanisms. Language comprehension creates no overall boost in response gain or envelope-tracking precision but instead modulates the spatial and temporal profiles of envelope-tracking activity. Based on the spatiotemporal dynamics of envelope-tracking activity, the 2 groups of listeners and the 4 levels of noise intensity can be jointly decoded by a linear classifier. Altogether, the results show that, without feedback from language processing, auditory mechanisms such as gain control can lead to a noise-robust speech representation. High-level language processing, however, further modulates the spatiotemporal profiles of the neural representation of the speech envelope.
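To make the stimulus manipulation concrete, the following is a minimal sketch of mixing a speech signal with stationary noise at several intensity levels. It is an illustration under stated assumptions, not the study's actual procedure: the four signal-to-noise ratios, the toy amplitude-modulated "speech" signal, the sampling rate, and the function names `mix_at_snr` and `measured_snr_db` are all hypothetical, and the noise here is white Gaussian noise rather than noise spectrally matched to the speech.

```python
import math
import random


def rms(x):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise RMS ratio equals `snr_db`.

    Returns the mixture and the scaled noise. The speech level is held
    fixed; only the noise intensity changes.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    scale = target_noise_rms / rms(noise)
    scaled_noise = [scale * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measured_snr_db(speech, noise):
    """Signal-to-noise ratio in dB, computed from RMS amplitudes."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Toy stand-in for speech: a 200 Hz carrier with a slow 4 Hz envelope.
fs = 1000  # assumed sampling rate in Hz
speech = [
    math.sin(2 * math.pi * 4 * t / fs) * math.sin(2 * math.pi * 200 * t / fs)
    for t in range(fs)
]

# White Gaussian noise; the study uses spectrally matched noise instead.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

# Hypothetical levels: the abstract states 4 levels but not their values.
HYPOTHETICAL_SNR_LEVELS_DB = [9, 3, -3, -9]

for snr_db in HYPOTHETICAL_SNR_LEVELS_DB:
    mixture, scaled_noise = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, scaled_noise), 6))
```

Holding the speech level fixed and scaling only the noise mirrors the description of noise intensity being varied across conditions; spectrally matched noise would additionally require shaping the noise spectrum to the long-term spectrum of the speech.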
