Mutual dependence of the octave-band weights in predicting speech intelligibility

Current objective measures that predict speech intelligibility with a single index assume that the index can be obtained by simply adding the contributions of individual frequency bands. The Articulation Index (AI), the related Speech Intelligibility Index (SII), and the Speech Transmission Index (STI) are all based on this assumption. There is evidence that assuming additive (mutually independent) band contributions is not optimal and may lead to erroneous intelligibility predictions for conditions with a limited or discontinuous frequency transfer. Depending on the frequency band considered, errors between 0.1 and 0.25 STI may occur. An experiment was designed to estimate the contributions of individual frequency bands and their mutual dependence. For this purpose the speech spectrum was subdivided into seven octave bands with center frequencies ranging from 125 Hz to 8 kHz. For 26 different combinations of three or more octave bands, the CVC-word score (consonant-vowel-consonant nonsense words) was determined at three signal-to-noise ratios. Successful prediction of the scores required a revised model that accounts for the mutual dependency between adjacent octave bands by introducing a so-called redundancy correction. Consequences for the existing objective measures are discussed. The presented results are included in the revised IEC standard (IEC 60268-16, 1998).
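
The difference between a purely additive index and one with a redundancy correction for adjacent bands can be illustrated with a minimal sketch. The abstract does not state the formula, so the form used here (a weighted sum of per-band transmission indices minus a weighted geometric-mean term for each adjacent pair) is an assumption, and the weights `ALPHA` and `BETA` are hypothetical placeholders, not the values derived in the experiment or published in the standard.

```python
import math

N_BANDS = 7  # octave bands from 125 Hz to 8 kHz

# Hypothetical placeholder weights (NOT the published values).
# They are chosen so that sum(ALPHA) - sum(BETA) == 1, which makes
# the index equal 1 when every band is transmitted perfectly.
BETA = [0.05] * (N_BANDS - 1)               # one weight per adjacent pair
ALPHA = [(1.0 + sum(BETA)) / N_BANDS] * N_BANDS


def additive_index(ti, alpha):
    """Classical assumption: bands contribute independently."""
    return sum(a * t for a, t in zip(alpha, ti))


def redundancy_corrected_index(ti, alpha, beta):
    """Assumed form: additive term minus an adjacent-band overlap term."""
    if len(alpha) != len(ti) or len(beta) != len(ti) - 1:
        raise ValueError("need one alpha per band and one beta per adjacent pair")
    direct = sum(a * t for a, t in zip(alpha, ti))
    # The overlap is non-zero only when both neighbouring bands carry information.
    overlap = sum(b * math.sqrt(ti[k] * ti[k + 1]) for k, b in enumerate(beta))
    return direct - overlap


if __name__ == "__main__":
    full = [1.0] * N_BANDS
    alternating = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # discontinuous transfer
    for name, ti in (("full", full), ("alternating", alternating)):
        print(name,
              round(additive_index(ti, [1.0 / N_BANDS] * N_BANDS), 3),
              round(redundancy_corrected_index(ti, ALPHA, BETA), 3))
```

With these placeholder weights both models agree for a full-bandwidth channel, but for the alternating-band case the corrected model yields a higher index than the additive one, because non-adjacent bands incur no redundancy penalty. That is the qualitative behaviour the abstract attributes to conditions with a discontinuous frequency transfer.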
