This paper studies noise reduction in the short-time Fourier transform (STFT) domain. Traditionally, the STFT coefficients in different frequency bands are assumed to be independent. This assumption holds when the signals are stationary and the fast Fourier transform (FFT) length is sufficiently large. In practice, however, speech is nonstationary, and the FFT length cannot be made very large for practical reasons. As a result, there is always some correlation between STFT coefficients from neighboring frequency bands. An important question then arises: how can this interband correlation be used to optimize noise reduction performance? This paper addresses that question. We discuss two solutions in the framework of the bifrequency spectrum. One considers the cross-correlation between all frequency bands; the other takes into account only the cross-correlation between neighboring bands. While the former is optimal from a theoretical perspective, the latter is more practical because it is less sensitive to errors in correlation matrix estimation.
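The claim that interband correlation fades only as the FFT length grows can be illustrated with a small numerical sketch. It is not the paper's method: it assumes a stationary first-order autoregressive signal with autocorrelation 0.5^|lag| (a hypothetical stand-in for a colored signal), a rectangular analysis window, and a single frame. The function names `band_covariance` and `mean_adjacent_coherence` are illustrative. The code evaluates the exact covariance between DFT coefficients of neighboring bands from the autocorrelation, so no random data is involved.

```python
import cmath
import math


def band_covariance(acf, n_fft, k, l):
    """E[X_k * conj(X_l)] for one rectangular-windowed frame of n_fft samples.

    acf(lag) is the autocorrelation of the (assumed stationary) signal.
    """
    total = 0j
    for n in range(n_fft):
        for m in range(n_fft):
            phase = -2.0 * math.pi * (k * n - l * m) / n_fft
            total += acf(abs(n - m)) * cmath.exp(1j * phase)
    return total


def mean_adjacent_coherence(acf, n_fft):
    """Average magnitude of the coherence between bands k and k+1."""
    var = [band_covariance(acf, n_fft, k, k).real for k in range(n_fft)]
    coh = []
    for k in range(n_fft - 1):
        cross = band_covariance(acf, n_fft, k, k + 1)
        coh.append(abs(cross) / math.sqrt(var[k] * var[k + 1]))
    return sum(coh) / len(coh)


def ar1_acf(lag):
    """Assumed example signal: AR(1) autocorrelation with coefficient 0.5."""
    return 0.5 ** lag


def white_acf(lag):
    """White noise: correlated only at lag zero."""
    return 1.0 if lag == 0 else 0.0


if __name__ == "__main__":
    for n_fft in (8, 64):
        print(n_fft, mean_adjacent_coherence(ar1_acf, n_fft))
```

Under these assumptions, the coherence between neighboring bands is clearly nonzero for the short frame and smaller for the longer one, while white noise gives essentially zero coherence at any length. Nonstationarity, which the abstract names as the other source of interband correlation, is not modeled here.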