Classical speech enhancement algorithms often require a good estimate of the short-time power spectrum, obtained, for instance, with periodogram methods. However, it is well known that traditional periodogram estimates have large variance and hence produce "musical noise" after enhancement. To alleviate this problem, multitaper spectrum (MTS) estimators with wavelet denoising have been proposed. In this paper, we investigate the properties of the MTS of noisy speech signals. We find that, in the log MTS domain, the variance of the noise varies with the magnitude of the underlying speech spectrum. This implies that, when wavelet denoising is applied to the log MTS, the constant threshold used in traditional methods is not appropriate. Based on this observation, we develop a wavelet denoising method with an adaptive threshold for multitaper power spectrum estimation. Simulation results show that the spectrum estimated with the proposed method is consistently more accurate than that obtained with traditional uniform-thresholding methods. Hence, it further improves current MTS-based speech enhancement algorithms.
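For orientation, the constant-threshold baseline that the abstract argues against can be sketched as follows. This is a minimal illustration under stated assumptions, not the method proposed in the paper: the function names are hypothetical, sine tapers and a Haar wavelet are stand-ins chosen to keep the example dependency-free, and a single universal threshold is applied at every wavelet level. The sketch relies on the standard result that, for K tapers, the log MTS error is approximately distributed as the log of a scaled chi-square variable with 2K degrees of freedom, with mean digamma(K) - log K and variance trigamma(K).

```python
import cmath
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function at a positive integer k."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def trigamma_int(k):
    """Trigamma function at a positive integer k."""
    return math.pi ** 2 / 6 - sum(1.0 / j ** 2 for j in range(1, k))


def sine_tapers(n, k):
    """k orthonormal sine tapers of length n (assumed taper family)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * (j + 1) * (t + 1) / (n + 1))
             for t in range(n)] for j in range(k)]


def multitaper_spectrum(x, k):
    """Average of k tapered periodograms over all n DFT bins."""
    n = len(x)
    spec = [0.0] * n
    for taper in sine_tapers(n, k):
        tapered = [a * b for a, b in zip(x, taper)]
        for f in range(n):
            coef = sum(v * cmath.exp(-2j * math.pi * f * t / n)
                       for t, v in enumerate(tapered))
            spec[f] += abs(coef) ** 2 / k
    return spec


def haar_forward(v, levels):
    """Orthonormal Haar DWT: returns (approximation, list of detail bands)."""
    approx, details = list(v), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Inverse of haar_forward."""
    out = list(approx)
    for band in reversed(details):
        nxt = []
        for s, w in zip(out, band):
            nxt += [(s + w) / math.sqrt(2), (s - w) / math.sqrt(2)]
        out = nxt
    return out


def soft(w, thr):
    """Soft-thresholding rule."""
    return math.copysign(max(abs(w) - thr, 0.0), w)


def denoise_log_mts(spec, k, levels=3):
    """Constant-threshold wavelet denoising of the log MTS (baseline only)."""
    n = len(spec)
    if n % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels")
    bias = digamma_int(k) - math.log(k)          # mean of the log-domain error
    log_spec = [math.log(max(s, 1e-300)) - bias for s in spec]
    # Universal threshold with the same noise level assumed at every frequency.
    thr = math.sqrt(trigamma_int(k)) * math.sqrt(2.0 * math.log(n))
    approx, details = haar_forward(log_spec, levels)
    details = [[soft(w, thr) for w in band] for band in details]
    return [math.exp(v) for v in haar_inverse(approx, details)]
```

Calling `denoise_log_mts(multitaper_spectrum(x, 5), 5)` on a 64-sample frame `x` returns a smoothed spectrum of the same length. The threshold here is fixed across frequency and ignores the correlation between neighbouring log MTS values; replacing it with one that tracks the local noise variance is the direction the abstract describes, and the details of that adaptive rule are not reproduced here.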