Enhancement of speech using bark-scaled wavelet packet decomposition

In this paper, we propose a speech enhancement system, which integrates a bark-scaled wavelet packet decomposition (BS-WPD), a soft-decision gain modification and a "magnitude" decision-directed estimation technique. The BS-WPD provides an overcomplete auditory representation, having a higher frequency resolution than the critical band decomposition. Speech is estimated by Wiener filtering in the wavelet packet domain, modified by the signal presence probability. We introduce a "magnitude" decision-directed estimator for the variance of speech, which is closely related to the decision-directed estimator of Ephraim and Malah. This estimator achieves, in the established process, a better trade-off between noise reduction and signal distortion. The proposed enhancement algorithm is tested with various noise types, and compared to a conventional log-spectral amplitude estimator. We show that noise can be further suppressed, while preserving its natural structure and the intelligibility and quality of the speech components.
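
The abstract does not spell out the "magnitude" decision-directed estimator or the exact soft-decision rule, so the following is only a rough sketch of the general idea for a single subband. It uses the classic decision-directed a priori SNR estimate in the form associated with Ephraim and Malah, a Wiener gain, and a simple multiplicative weighting by the signal presence probability. The function name, the parameters `alpha` and `gain_floor`, the fixed noise variance, and the weighting rule itself are all assumptions made for illustration, not the authors' method.

```python
def decision_directed_gains(noisy_power, noise_var, presence_prob,
                            alpha=0.98, gain_floor=0.1):
    """Per-frame gains for ONE subband (illustrative sketch only).

    noisy_power   : |Y(t)|^2 for each frame t
    noise_var     : noise variance, assumed known and constant here
    presence_prob : p(t), probability that speech is present in frame t
    alpha         : smoothing weight of the decision-directed estimate (assumed)
    gain_floor    : lower bound on the applied gain (assumed)
    """
    gains = []
    prev_clean_power = 0.0  # estimated clean power of the previous frame
    for y2, p in zip(noisy_power, presence_prob):
        post_snr = y2 / noise_var  # a posteriori SNR
        # Classic decision-directed a priori SNR: mix the previous clean
        # estimate with the instantaneous (a posteriori SNR - 1) term.
        prior_snr = (alpha * prev_clean_power / noise_var
                     + (1.0 - alpha) * max(post_snr - 1.0, 0.0))
        wiener = prior_snr / (1.0 + prior_snr)
        # Assumed soft-decision rule: scale the Wiener gain by p(t),
        # then keep it above the floor.
        gains.append(max(p * wiener, gain_floor))
        # Feed the Wiener-based clean power estimate to the next frame.
        prev_clean_power = (wiener ** 2) * y2
    return gains


# Two noise-only frames followed by two frames with speech energy.
example_gains = decision_directed_gains(
    noisy_power=[1.0, 1.0, 10.0, 10.0],
    noise_var=1.0,
    presence_prob=[0.0, 0.0, 1.0, 1.0],
)
```

In this sketch the gain stays at the floor while the presence probability is zero and rises gradually once speech energy appears, which is the smoothing behaviour that decision-directed estimators are generally used for. In a full system, gains like these would be applied to each wavelet packet coefficient before reconstruction.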

[1]  Israel Cohen,et al.  On speech enhancement under signal presence uncertainty , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[2]  Israel Cohen,et al.  Spectral Enhancement By Tracking Speech Presence Probability In Subbands , 2001 .

[3]  Andrzej Drygajlo,et al.  Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms , 1999, IEEE Trans. Signal Process..

[4]  Nathalie Virag,et al.  Single channel speech enhancement based on masking properties of the human auditory system , 1999, IEEE Trans. Speech Audio Process..

[5]  Eliathamby Ambikairajah,et al.  Wavelet transform-based speech enhancement , 1998, ICSLP.

[6]  Sridha Sridharan,et al.  Speech enhancement using critical band spectral subtraction , 1998, ICSLP.

[7]  Andreas Engelsberg,et al.  Comparison of a discrete wavelet transformation and a nonuniform polyphase filterbank applied to spectral-subtraction speech enhancement , 1998, Signal Process..

[8]  Henning Puder,et al.  Speech enhancement for mobile telephony based on non-uniformly spaced frequency resolution , 1998, 9th European Signal Processing Conference (EUSIPCO 1998).

[9]  Soo Ngee Koh,et al.  Wavelet for speech denoising , 1997, TENCON '97 Brisbane - Australia. Proceedings of IEEE TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications (Cat. No.97CH36162).

[10]  M. Victor Wickerhauser,et al.  Adapted wavelet analysis from theory to software , 1994 .

[11]  H. Traunmüller Analytical expressions for the tonotopic sensory scale , 1990 .

[12]  David Malah,et al.  Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..

[13]  Y. Ephraim,et al.  Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .