Speech Intelligibility Predicted from Neural Entrainment of the Speech Envelope

Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Electrophysiological measures of hearing are often used to gain insight into the neural processing of sound. However, most of these methods use non-speech stimuli, which makes it hard to relate their results to behavioral measures of speech intelligibility. Using natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift that makes it possible to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system and the development of closed-loop auditory prostheses that automatically adapt to individual users.
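
The decode-and-correlate idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the decoder, so everything below is an assumption made for illustration: a linear decoder fitted by ridge regression on time-lagged EEG, white-noise signals standing in for real recordings, and hypothetical function names and parameter values (`lag_matrix`, `train_decoder`, `n_lags`, `ridge`, `gain`). It is not the authors' pipeline.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack EEG samples at delays 0..n_lags-1 after each time point.

    The neural response follows the stimulus, so the envelope at time t
    is decoded from eeg[t], eeg[t+1], ..., eeg[t+n_lags-1].
    """
    n_samples, n_channels = eeg.shape
    lagged = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        cols = slice(lag * n_channels, (lag + 1) * n_channels)
        # Row t holds eeg[t + lag]; rows that would run past the end stay zero.
        lagged[: n_samples - lag, cols] = eeg[lag:]
    return lagged


def train_decoder(eeg, envelope, n_lags, ridge=1.0):
    """Fit linear decoder weights with ridge regression (assumed method)."""
    X = lag_matrix(eeg, n_lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ envelope)


def envelope_correlation(eeg, envelope, weights, n_lags):
    """Pearson correlation between the decoded and the actual envelope."""
    reconstructed = lag_matrix(eeg, n_lags) @ weights
    return np.corrcoef(reconstructed, envelope)[0, 1]


def simulate_eeg(envelope, mixing, gain, rng, delay=3):
    """Synthetic stand-in for EEG: delayed envelope plus channel noise."""
    delayed = np.concatenate([np.zeros(delay), envelope[:-delay]])
    noise = rng.standard_normal((envelope.size, mixing.size))
    return gain * np.outer(delayed, mixing) + noise


def tracking_score(gain, seed=0, n_samples=2000, n_channels=8, n_lags=5):
    """Train on one synthetic segment, then score a held-out segment."""
    rng = np.random.default_rng(seed)
    mixing = rng.standard_normal(n_channels)  # hypothetical channel weights
    env_train = rng.standard_normal(n_samples)
    env_test = rng.standard_normal(n_samples)
    eeg_train = simulate_eeg(env_train, mixing, gain, rng)
    eeg_test = simulate_eeg(env_test, mixing, gain, rng)
    weights = train_decoder(eeg_train, env_train, n_lags)
    return envelope_correlation(eeg_test, env_test, weights, n_lags)


# gain=1.0 mimics EEG that tracks the envelope; gain=0.0 mimics no tracking.
print("tracking:   ", tracking_score(gain=1.0))
print("no tracking:", tracking_score(gain=0.0))
```

In this toy setting the held-out correlation is high when the simulated EEG carries the envelope and close to zero when it does not. That correlation is the kind of quantity the abstract relates to behaviorally measured speech intelligibility.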
