Speech enhancement using fourth-order cumulants and time-domain optimal filters

This paper proposes a new method for speech enhancement based on optimal filtering, subband decomposition, and higher-order cumulants. The key idea is to use the fourth-order cumulant to estimate the parameters required by the enhancement filters: it is shown that, when speech is divided into narrow bands and modeled as a sinusoidal signal in each band, the kurtosis of the noisy speech can be used to estimate both the SNR and the probability of speech presence. The resulting algorithm is tested in typical mobile noise conditions and proves effective for noise types such as street, office, and fan noise. Compared to the noise reduction of the TIA IS-127 standard, the proposed algorithm better preserves the harmonic structure of the speech and achieves more overall noise reduction in Gaussian-like noise conditions. However, this comes at the cost of slightly more noise artifacts, mostly at very low SNR and in non-Gaussian noise conditions.
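The link between kurtosis and SNR can be illustrated with a minimal sketch. It is not the estimator derived in the paper, only a textbook identity under stated assumptions: a single narrow band containing one constant-amplitude sinusoid of power P_s plus independent zero-mean Gaussian noise of variance sigma^2. The fourth-order cumulant of the sinusoid is -1.5 P_s^2, that of the Gaussian noise is zero, and cumulants of independent signals add, so the excess kurtosis of the mixture is -1.5 (SNR / (1 + SNR))^2 with SNR = P_s / sigma^2. The function names, the test frequency, and the sample count below are illustrative choices, not taken from the paper.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: m4 / m2^2 - 3 (zero for Gaussian data)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    m2 = np.mean(x ** 2)
    m4 = np.mean(x ** 4)
    return m4 / m2 ** 2 - 3.0

def snr_from_kurtosis(x):
    """Invert kappa = -1.5 * (SNR / (1 + SNR))^2 for the linear SNR.

    Assumes one sinusoid in independent Gaussian noise (illustrative model).
    """
    kappa = excess_kurtosis(x)
    # r = SNR / (1 + SNR); clip so sampling error cannot leave [0, 1).
    r = np.sqrt(np.clip(-kappa / 1.5, 0.0, 1.0 - 1e-12))
    return r / (1.0 - r)

# Synthetic check: unit-variance Gaussian noise, sinusoid power = true_snr.
rng = np.random.default_rng(0)
n = 200_000
true_snr = 4.0                      # linear scale, about 6 dB
amp = np.sqrt(2.0 * true_snr)       # sinusoid power is amp^2 / 2
t = np.arange(n)
s = amp * np.cos(0.1 * np.pi * t + 0.3)
noise = rng.standard_normal(n)
x = s + noise

est_snr = snr_from_kurtosis(x)
noise_only_snr = snr_from_kurtosis(noise)
print("estimated SNR (linear):", est_snr)
print("estimated SNR for noise only:", noise_only_snr)
```

In this idealized model, an excess kurtosis near zero suggests a noise-dominated band and a value near -1.5 suggests a band dominated by the sinusoid, which is the intuition behind using kurtosis as an indicator of speech presence. Real speech bands and non-Gaussian noise depart from these assumptions.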
