Robustness of the Hearing Aid Speech Quality Index (HASQI)

Objective measures of speech quality have been studied extensively, particularly for speech codecs and communication channels with normal-hearing listeners. A primary concern in this area is how well these metrics generalize to datasets or listener studies that are “unknown” to the measures, that is, not used in their development. Another growing concern is how these metrics perform for the hearing-impaired community. Researchers working with this community need to be able to predict how hearing-impaired listeners will perceive the quality of speech in general, and of speech processed by hearing aids in particular. A relatively recent metric, the Hearing Aid Speech Quality Index (HASQI), is a model-based objective measure of quality developed in the context of hearing aids for both normal-hearing and hearing-impaired listeners (Kates & Arehart, Journal of the Audio Engineering Society, 2010). As such, HASQI makes substantial progress on some of the generalization issues. However, HASQI has not yet been tested on any dataset other than the one on which it was trained. The objective of this study is to demonstrate the robustness of HASQI in predicting subjective quality. We use an “unknown” dataset of noisy speech processed by noise suppression algorithms, together with the corresponding subjective quality scores from normal-hearing listeners, to demonstrate HASQI's prediction performance. We also compare HASQI's performance with that of several other objective measures to provide a point of reference.
