Paper information - DHASP: Differentiable Hearing Aid Speech Processing

Hearing aids are expected to improve speech intelligibility for listeners with hearing impairment. An amplification fitting tuned to the listener's hearing loss is critical for good performance. Most prescriptive fittings have been developed from data collected in subjective listening experiments, which are usually expensive and time-consuming. In this paper, we explore an alternative approach to finding the optimal fitting: we introduce a hearing aid speech processing framework in which the fitting is optimised automatically using an intelligibility objective function based on the HASPI physiological auditory model. The framework is fully differentiable and can therefore use the back-propagation algorithm for efficient, data-driven optimisation. Our initial objective experiments show promising results for noise-free speech amplification, where the automatically optimised processors outperform one of the well-recognised hearing aid prescriptions.
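The idea of fitting by gradient descent on a differentiable objective can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the paper's method: the per-band sigmoid "audibility" score merely stands in for the HASPI-based objective, the quadratic gain penalty, the band levels and the thresholds are invented, and the gradient is derived by hand rather than obtained by back-propagation through an auditory model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def loss_and_grad(gains_db, speech_db, thresholds_db,
                  slope_db=10.0, penalty=1e-4):
    """Toy objective (an assumption, NOT HASPI).

    Each band scores a soft audibility a = sigmoid((level + gain - threshold) / slope).
    The loss is the mean of (-a + penalty * gain^2), so more audibility is
    rewarded and excessive gain is discouraged.
    Returns the loss and its hand-derived gradient w.r.t. each gain.
    """
    n = len(gains_db)
    loss = 0.0
    grad = []
    for g, level, thr in zip(gains_db, speech_db, thresholds_db):
        a = sigmoid((level + g - thr) / slope_db)
        loss += (-a + penalty * g * g) / n
        # d(-a)/dg = -a * (1 - a) / slope; d(penalty * g^2)/dg = 2 * penalty * g
        grad.append((-a * (1.0 - a) / slope_db + 2.0 * penalty * g) / n)
    return loss, grad


def optimise_gains(speech_db, thresholds_db, steps=2000, lr=500.0):
    """Plain gradient descent on the per-band gains, starting from 0 dB."""
    gains = [0.0] * len(speech_db)
    for _ in range(steps):
        _, grad = loss_and_grad(gains, speech_db, thresholds_db)
        gains = [g - lr * d for g, d in zip(gains, grad)]
    return gains


if __name__ == "__main__":
    # Hypothetical band levels of the speech and hearing thresholds, in dB.
    speech = [60.0, 55.0, 50.0, 45.0]
    thresholds = [20.0, 35.0, 50.0, 65.0]
    fitted = optimise_gains(speech, thresholds)
    print([round(g, 1) for g in fitted])
```

In this toy setting, bands whose thresholds sit well above the speech level end up with more gain than bands that are already audible. In a full framework the hand-written gradient would be replaced by automatic differentiation through the whole processing chain.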
