Paper Information - Objective Prediction of Hearing Aid Benefit Across Listener Groups Using Machine Learning: Speech Recognition Performance With Binaural Noise-Reduction Algorithms

The simulation framework for auditory discrimination experiments (FADE) was adopted and validated to predict the individual speech-in-noise recognition performance of listeners with normal and impaired hearing, with and without a given hearing-aid algorithm. FADE uses a simple automatic speech recognizer (ASR) to estimate the lowest achievable speech reception thresholds (SRTs) from simulated speech recognition experiments objectively, that is, independent of any empirical reference data. Empirical data from the literature were used to evaluate the model in terms of predicted SRTs, and benefits in SRT, for the German matrix sentence recognition test with eight single- and multichannel binaural noise-reduction algorithms. To allow individual predictions of SRTs in binaural conditions, the model was extended with a simple better-ear approach and individualized by taking audiograms into account. In a realistic binaural cafeteria condition, FADE explained about 90% of the variance of the empirical SRTs for a group of normal-hearing listeners and predicted the corresponding benefits with a root-mean-square prediction error of 0.6 dB. This highlights the potential of the approach for the objective assessment of benefits in SRT without prior knowledge of the empirical data. The predictions for the group of listeners with impaired hearing explained 75% of the empirical variance, while the individual predictions explained less than 25%. Additional individual factors may therefore need to be considered for more accurate predictions with impaired hearing. A competing-talker condition clearly showed one limitation of current ASR technology: empirical SRTs lower than −20 dB could not be predicted.
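The quantities named in the abstract (better-ear SRT, benefit in SRT, root-mean-square prediction error, explained variance) can be illustrated with a minimal Python sketch. This is not the FADE implementation. All function names are hypothetical, the better-ear rule is assumed to mean "take the lower of the two monaural SRTs", and explained variance is assumed to be the squared Pearson correlation between predicted and empirical SRTs.

```python
import math


def better_ear_srt(srt_left_db, srt_right_db):
    """Assumed better-ear rule: the lower (better) of the two monaural SRTs in dB SNR."""
    return min(srt_left_db, srt_right_db)


def srt_benefit(srt_unprocessed_db, srt_processed_db):
    """Benefit in SRT (dB); positive when the algorithm lowers the SRT."""
    return srt_unprocessed_db - srt_processed_db


def rms_error(predicted, empirical):
    """Root-mean-square difference between predicted and empirical values (dB)."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, empirical)]
    return math.sqrt(sum(squared) / len(squared))


def explained_variance(predicted, empirical):
    """Squared Pearson correlation (assumed definition of 'explained variance')."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(empirical) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, empirical))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in empirical)
    return cov * cov / (var_p * var_e)
```

In use, one would compute a better-ear SRT per condition from the two simulated ears, derive the benefit against the unprocessed condition, and then compare the predicted and empirical values across algorithms with `rms_error` and `explained_variance`.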
