An Interpretable Performance Metric for Auditory Attention Decoding Algorithms in a Context of Neuro-Steered Gain Control

In a multi-speaker scenario, a hearing aid lacks information on which speaker the user intends to attend to, and therefore it often mistakenly treats the attended speaker as noise while enhancing an interfering speaker. Recently, it has been shown that the attended speaker can be decoded from brain activity, e.g., as recorded by electroencephalography (EEG) sensors. While numerous such auditory attention decoding (AAD) algorithms have appeared in the literature, their performance is generally evaluated in a non-uniform manner, in which the trade-off between AAD accuracy and the time needed to make an AAD decision is not properly incorporated. We present an interpretable performance metric to evaluate AAD algorithms, based on an adaptive gain control system steered by AAD decisions. Such a system can be modeled as a Markov chain, from which the minimal expected switch duration (MESD) can be calculated. The MESD can be interpreted as the expected time required to switch the operation of the hearing aid after an attention switch of the user, thereby resolving the trade-off between AAD accuracy and decision time. Furthermore, we show that the MESD calculation provides an automatic and theoretically founded procedure to optimize the step size and decision frequency in an AAD-based adaptive gain control system.
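To make the Markov-chain view concrete, the following is a minimal sketch of an expected switch time in a simplified gain control model. It is not the MESD as defined in the paper: the abstract does not give the chain's structure or the optimization over step size, so everything here is an assumption made for illustration. The sketch assumes a line of gain states numbered from 1, that each AAD decision moves the gain one state toward the attended speaker with probability `p` (the AAD accuracy) and one state away otherwise, that the chain stays in state 1 when it would step below it, and that one decision is made every `tau` seconds. The function name and all parameter names are hypothetical.

```python
def expected_switch_time(p, target_state, tau):
    """Expected time in seconds to first reach `target_state` from state 1.

    Simplified, assumed model (not the paper's exact MESD definition):
      - states 1, 2, ..., target_state lie on a line;
      - each AAD decision steps up with probability p, down with 1 - p;
      - in state 1 a "down" step leaves the chain in state 1;
      - one decision is made every `tau` seconds.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if target_state < 1:
        raise ValueError("target_state must be at least 1")
    if tau <= 0.0:
        raise ValueError("tau must be positive")

    q = 1.0 - p
    steps_up_one = 0.0   # expected steps to go from state i to i + 1
    total_steps = 0.0
    for _ in range(1, target_state):
        # First-step analysis: h_i = 1/p + (q/p) * h_{i-1}, with h_0 = 0.
        steps_up_one = 1.0 / p + (q / p) * steps_up_one
        total_steps += steps_up_one
    return total_steps * tau


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(expected_switch_time(p=0.8, target_state=5, tau=10.0))
```

Under these assumptions the trade-off described above becomes visible: a shorter decision window lowers `tau` but typically also lowers `p`, and the expected switch time depends on both. Evaluating the function over an algorithm's measured accuracy at several window lengths, and taking the minimum, mirrors the idea of optimizing the decision frequency.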
