Decoding of selective attention to continuous speech from the human auditory brainstem response

Humans are highly skilled at analysing complex acoustic scenes. The segregation of different acoustic streams, and the formation of corresponding neural representations, is mostly attributed to the auditory cortex. Decoding of selective attention from neuroimaging has therefore focussed on cortical responses to sound. However, the auditory brainstem response to speech is modulated by selective attention as well, as recently shown through measuring the brainstem's response to running speech. Although the response of the auditory brainstem has a smaller magnitude than that of the auditory cortex, it occurs at much higher frequencies and therefore carries a higher information rate. Here we develop statistical models for extracting the brainstem response from multi-channel scalp recordings and for analysing its modulation by the focus of attention. We demonstrate that this attentional modulation of the brainstem response to speech can be employed to decode a listener's attentional focus from short measurements of 10 s or less in duration. The decoding remains accurate when obtained from only three EEG channels. We further show that out-of-the-box decoding employing subject-independent models, as well as decoding that is independent of the specific attended speaker, achieves similar accuracy. These results open up new avenues for investigating the neural mechanisms of selective attention in the brainstem and for developing efficient auditory brain-computer interfaces.
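
The decoding scheme sketched in the abstract, reconstructing the attended speech feature from the EEG with a linear model and then comparing correlations with each competing speaker over short windows, can be illustrated on synthetic data. The following is a minimal sketch only: the signals, sampling rate, and single-channel backward model are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 100            # hypothetical sampling rate (Hz), for illustration only
win = 10 * fs       # 10-second decoding windows, as in the abstract

# Simulate two competing speech features and one EEG channel that
# weakly tracks the attended speaker (plus noise). All signals synthetic.
n = 60 * fs
speech_a = rng.standard_normal(n)        # attended speaker's feature
speech_b = rng.standard_normal(n)        # ignored speaker's feature
eeg = 0.3 * speech_a + rng.standard_normal(n)

# Fit a linear backward model by least squares on the first half of the
# recording, reconstructing the attended speech feature from the EEG.
half = n // 2
w = np.linalg.lstsq(eeg[:half, None], speech_a[:half], rcond=None)[0]

# Decode attention on held-out 10 s windows: whichever speaker's feature
# correlates better with the reconstruction is declared "attended".
starts = list(range(half, n - win + 1, win))
correct = 0
for s in starts:
    recon = eeg[s:s + win] * w[0]
    r_a = np.corrcoef(recon, speech_a[s:s + win])[0, 1]
    r_b = np.corrcoef(recon, speech_b[s:s + win])[0, 1]
    correct += r_a > r_b

accuracy = correct / len(starts)
print(f"decoding accuracy over {len(starts)} windows: {accuracy:.2f}")
```

A real analysis would use multi-channel regressors with time lags and cross-validation, but the core decision rule, comparing correlations of a reconstructed feature against the two candidate speech streams within a short window, is the one shown here.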
