Comparison of Two-Talker Attention Decoding from EEG with Nonlinear Neural Networks and Linear Methods

Auditory attention decoding (AAD) through a brain-computer interface has seen rapid development since it was first introduced by Mesgarani and Chang (2012) using electrocorticographic (ECoG) recordings. AAD has been pursued for its potential application to hearing-aid design, in which an attention-guided algorithm selects, from multiple competing acoustic sources, which should be enhanced for the listener and which should be suppressed. Traditionally, researchers have separated the AAD problem into two stages: reconstructing a representation of the attended audio from neural signals, then measuring the similarity between each candidate audio stream and that reconstruction. Here, we compare the traditional two-stage approach with a novel neural-network architecture that subsumes the explicit similarity step. We evaluate this new architecture against linear and nonlinear (neural-network) baselines using both wet and dry electroencephalography (EEG) systems. Our results indicate that the new architecture outperforms the baseline linear stimulus-reconstruction method, improving decoding accuracy from 66% to 81% with wet EEG and from 59% to 87% with dry EEG. Notably, the dry EEG system delivered results comparable to, or better than, those of the wet system, despite having only one third as many EEG channels. The 11-subject wet-electrode AAD dataset for two competing, co-located talkers, the matching 11-subject dry-electrode AAD dataset, and our software are available for further validation, experimentation, and modification.
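To make the two-stage baseline concrete, the following is a minimal sketch of linear stimulus reconstruction followed by a correlation-based similarity step: a ridge-regression backward model maps time-lagged EEG to the attended speech envelope, and Pearson correlation then selects whichever candidate talker's envelope best matches the reconstruction. This is an illustration of the general technique, not this paper's exact pipeline; the lag window, the regularization strength, and all names here (lagged_features, decode_attention, env_a, env_b) are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lagged_features(eeg, n_lags):
    """Stack time-lagged copies of each EEG channel into a design matrix.

    eeg: array of shape (n_samples, n_channels).
    Returns an array of shape (n_samples, n_channels * n_lags).
    """
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * n_channels:(lag + 1) * n_channels] = eeg[:n_samples - lag]
    return X

def decode_attention(eeg_train, env_train, eeg_test, env_a, env_b,
                     n_lags=32, alpha=1e3):
    """Two-stage AAD baseline: stimulus reconstruction, then similarity.

    Stage 1: fit a ridge-regression backward model from lagged EEG to the
    attended speech envelope on training data.
    Stage 2: reconstruct the envelope for a test segment and return the
    candidate talker whose envelope correlates best with it.
    n_lags and alpha are illustrative values, not tuned settings.
    """
    model = Ridge(alpha=alpha)
    model.fit(lagged_features(eeg_train, n_lags), env_train)
    recon = model.predict(lagged_features(eeg_test, n_lags))
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    return "talker_a" if r_a > r_b else "talker_b"
```

The end-to-end network studied here replaces both stages with a single classifier trained to output the attended-talker label directly from the EEG, which is what eliminates the explicit correlation step.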

[1] Blake S. Wilson, et al. Global hearing health care: new findings and perspectives, 2017, The Lancet.

[2] C. C. Andrade, et al. The silent impact of hearing loss: using longitudinal data to explore the effects on depression and social activity restriction among older people, 2017, Ageing and Society.

[3] Hans-Jochen Heinze, et al. Systematic comparison between a wireless EEG system with dry electrodes and a wired EEG system with wet electrodes, 2019, NeuroImage.

[4] Birger Kollmeier, et al. Machine learning for decoding listeners’ attention from electroencephalography evoked by continuous speech, 2020, The European Journal of Neuroscience.

[5] J. Betz, et al. Hearing Loss and Depression in Older Adults, 2013, Journal of the American Geriatrics Society.

[6] Lucas S. Baltzell, et al. Attention selectively modulates cortical entrainment in different regions of the speech spectrum, 2016, Brain Research.

[7] Zhuo Chen, et al. Neural decoding of attentional selection in multi-speaker environments without access to separated sources, 2017, 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).

[8] Carlos Francisco Mendoza, et al. Decoding Auditory Attention from Multivariate Neural Data using Cepstral Analysis, 2018.

[9] Maarten De Vos, et al. Auditory attention decoding with EEG recordings using noisy acoustic reference signals, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10] Maarten De Vos, et al. Decoding the attended speech stream with multi-channel EEG: implications for online, daily-life applications, 2015, Journal of Neural Engineering.

[11] Sepp Hochreiter, et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), 2015, ICLR.

[12] Jonathan Z. Simon, et al. Real-Time Tracking of Selective Auditory Attention From M/EEG: A Bayesian Filtering Approach, 2017, bioRxiv.

[13] Loukianos Spyrou, et al. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.

[14] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, Journal of Machine Learning Research.

[15] Alexander Bertrand, et al. EEG-based auditory attention detection: boundary conditions for background noise and speaker positions, 2018, Journal of Neural Engineering.

[16] Malcolm Slaney, et al. A Comparison of Regularization Methods in Forward and Backward Models for Auditory Attention Decoding, 2018, Frontiers in Neuroscience.

[17] D. Poeppel, et al. Mechanisms Underlying Selective Neuronal Tracking of Attended Speech at a “Cocktail Party”, 2013, Neuron.

[18] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.

[19] Alexander Bertrand, et al. EEG-Informed Attended Speaker Extraction From Recorded Speech Mixtures With Application in Neuro-Steered Hearing Prostheses, 2016, IEEE Transactions on Biomedical Engineering.

[20] Thomas F. Quatieri, et al. A vocal modulation model with application to predicting depression severity, 2016, 2016 IEEE 13th International Conference on Wearable and Implantable Body Sensor Networks (BSN).

[21] Alessandro Presacco, et al. Robust decoding of selective auditory attention from MEG in a competing-speaker environment via state-space modeling, 2016, NeuroImage.

[22] Alexander Bertrand, et al. Online detection of auditory attention with mobile EEG: closing the loop with neurofeedback, 2017, bioRxiv.

[23] L. Tarassenko, et al. 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2014.

[24] Alexander Bertrand, et al. Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario, 2017, IEEE Transactions on Neural Systems and Rehabilitation Engineering.

[25] Torsten Dau, et al. Noise-robust cortical tracking of attended speech in real-world acoustic scenes, 2017, NeuroImage.

[26] T. Picton, et al. Human Cortical Responses to the Speech Envelope, 2008, Ear and Hearing.

[27] Torsten Dau, et al. Towards cognitive control of hearing instruments using EEG measures of selective attention, 2018.

[28] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[29] Thomas Lunner, et al. Single-channel in-ear-EEG detects the focus of auditory attention to concurrent tone streams and mixed speech, 2016, bioRxiv.

[30] Christopher Simons, et al. Machine learning with Python, 2017.

[31] Alain de Cheveigné, et al. Decoding the auditory brain with canonical component analysis, 2017, NeuroImage.

[32] J. Simon, et al. Neural coding of continuous speech in auditory cortex during monaural and dichotic listening, 2012, Journal of Neurophysiology.

[33] Sergei Kochkin, et al. MarkeTrak VII: Customer satisfaction with hearing instruments in the digital age, 2005.

[34] Stefan Debener, et al. Identifying auditory attention with ear-EEG: cEEGrid versus high-density cap-EEG comparison, 2016, Journal of Neural Engineering.

[35] K. Tremblay, et al. How Neuroscience Relates to Hearing Aid Amplification, 2014, International Journal of Otolaryngology.

[36] Satrajit S. Ghosh, et al. Nipype: A Flexible, Lightweight and Extensible Neuroimaging Data Processing Framework in Python, 2011, Frontiers in Neuroinformatics.

[37] N. Mesgarani, et al. Selective cortical representation of attended speaker in multi-talker speech perception, 2012, Nature.

[38] Stig Arlinger, et al. Negative consequences of uncorrected hearing loss—a review, 2003, International Journal of Audiology.

[39] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, Journal of Machine Learning Research.

[40] John J. Foxe, et al. Attentional Selection in a Cocktail Party Environment Can Be Decoded from Single-Trial EEG, 2015, Cerebral Cortex.

[41] N. Lesica, Why Do Hearing Aids Fail to Restore Normal Auditory Perception?, 2018, Trends in Neurosciences.