Use of the Articulation Index to Design a Wavelet Packet-Based Method for Improving Speech Intelligibility

Speech transients are important cues for identifying and discriminating speech sounds, and several studies have suggested that selectively amplifying these transients can improve the intelligibility of speech in noise. This paper describes an improved version of the wavelet-based method for extracting transient speech that we described in [9], together with the use of the articulation index to select optimal parameters for the method. The new method combines subband decomposition by wavelet packets with transition rate characterization based on the first derivative of short-time energy. It also incorporates a threshold which, when varied, controls how much quasi-steady-state activity is included in the transient speech signal. The articulation index is used both to optimize the speech modification scheme and to estimate the resulting improvement in intelligibility.
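The pipeline outlined in the abstract (subband decomposition, short-time energy, first derivative, threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar filter, the non-overlapping frames, the use of a first difference in place of the derivative, and the names `haar_packet`, `short_time_energy`, and `transient_frames` are all assumptions chosen to keep the example self-contained.

```python
import math


def haar_packet(signal, levels):
    """Full Haar wavelet packet tree; returns 2**levels subband sequences.

    Assumption: Haar filters are used only for brevity. The abstract does
    not say which wavelet the method uses.
    """
    bands = [list(signal)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            pairs = range(0, len(band) - 1, 2)
            low = [(band[i] + band[i + 1]) / math.sqrt(2) for i in pairs]
            high = [(band[i] - band[i + 1]) / math.sqrt(2) for i in pairs]
            next_bands += [low, high]
        bands = next_bands
    return bands


def short_time_energy(band, frame_len):
    """Energy of consecutive non-overlapping frames (an assumed framing)."""
    return [sum(x * x for x in band[i:i + frame_len])
            for i in range(0, len(band) - frame_len + 1, frame_len)]


def transient_frames(band, frame_len, threshold):
    """Flag frames whose short-time energy changes by more than `threshold`."""
    energy = short_time_energy(band, frame_len)
    # First difference as a discrete stand-in for the first derivative.
    rate = [0.0] + [abs(energy[k] - energy[k - 1])
                    for k in range(1, len(energy))]
    return [r > threshold for r in rate]


if __name__ == "__main__":
    # Toy signal: a step in amplitude halfway through.
    toy = [1.0] * 32 + [3.0] * 32
    for index, band in enumerate(haar_packet(toy, 1)):
        print(index, transient_frames(band, frame_len=4, threshold=1.0))
```

In this sketch, lowering `threshold` flags more frames as transient, so more slowly varying (quasi-steady-state) activity is admitted, which mirrors the role the abstract gives to the threshold.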

[1] Amro El-Jaroudi, et al. Speech Enhancement Using Transient Speech Components, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[2] Sungyub Yoo, et al. Relative energy and intelligibility of transient speech components, 2004, 2004 12th European Signal Processing Conference.

[3] S. Gordon-Salant. Recognition of natural and time/intensity altered CVs by young and elderly subjects with normal hearing, 1986, The Journal of the Acoustical Society of America.

[4] Douglas D. O'Shaughnessy, et al. Speech enhancement based conceptually on auditory evidence, 1991, IEEE Trans. Signal Process.

[5] Jont B. Allen. Consonant recognition and the articulation index, 2005, Journal of the Acoustical Society of America.

[6] A. Liberman, et al. The role of consonant-vowel transitions in the perception of the stop and nasal consonants, 1954.

[7] K. D. Kryter. Methods for the Calculation and Use of the Articulation Index, 1962.

[8] Andrew C. Simpson, et al. The effect of cue-enhancement on the intelligibility of nonsense word and sentence materials presented in noise, 1998, Speech Commun.

[9] K. D. Kryter. Proposed Methods for the Calculation of the Articulation Index, 1961.

[10] Bruno Torrésani, et al. Hybrid representations for audiophonic signal encoding, 2002, Signal Process.

[11] Ching-Chung Li, et al. Identification of Speech Transients Using Variable Frame Rate Analysis and Wavelet Packets, 2006, 2006 International Conference of the IEEE Engineering in Medicine and Biology Society.

[12] R. Niederjohn, et al. The enhancement of speech intelligibility in high noise levels by high-pass filtering followed by rapid amplitude compression, 1976.

[13] P. Mermelstein, et al. Speech sounds and features, 1975, Proceedings of the IEEE.

[14] John G. Harris, et al. Applied principles of clear and Lombard speech for automated intelligibility enhancement in noisy environments, 2006, Speech Commun.

[15] H. Fletcher, et al. The Perception of Speech and Its Relation to Telephony, 1950.