Paper information - Comparison of Speech Enhancement Algorithms

Spectral subtraction is the simplest and most familiar method for removing stationary background noise. In this algorithm, a spectral noise bias is estimated from segments of speech inactivity and subtracted from the spectral amplitude of the noisy speech, while the noisy phase is retained unchanged. Secondary procedures follow the subtraction to reduce the unpleasant auditory effects caused by spectral error. The drawback of spectral subtraction is that it is applicable only to speech corrupted by stationary noise. Having previously studied spectral subtraction and the Wiener filter under a stationary-noise assumption, this work examines both algorithms when the speech is degraded by non-stationary noise. Next, the decision-directed (DD) approach is used to estimate the time-varying noise spectrum, which results in better intelligibility and reduced musical noise. However, the a priori SNR estimate for the current frame relies on the speech spectrum estimated from the previous frame. As an undesirable consequence, the gain function does not match the current frame, and the resulting bias causes an annoying echoing (reverberation) effect. The two-step noise reduction (TSNR) algorithm addresses this problem: it tracks the non-stationarity of the signal instantaneously without losing the advantages of the DD approach. An additional step refines the a priori SNR estimate by removing the bias, thereby eliminating the reverberation effect. Even with TSNR, the output still suffers from harmonic distortion, which is inherent to all short-time noise suppression techniques; the main cause is the inaccuracy of power spectral density (PSD) estimation in single-channel systems. To overcome this problem, harmonic regeneration noise reduction (HRNR) applies a non-linearity to regenerate the distorted or missing harmonics.
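The per-frame gain rules described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. It assumes magnitude spectra are already available for each frame, that a noise PSD estimate is supplied by the caller, and that a Wiener-type gain is used. The function names, the smoothing constant `beta = 0.98`, and the zero spectral floor are illustrative assumptions.

```python
def spectral_subtraction(noisy_mag, noise_mag, floor=0.0):
    """Subtract a noise magnitude estimate from one frame of noisy magnitudes.

    Negative results are clipped to floor * noisy magnitude (floor=0.0 gives
    half-wave rectification). The noisy phase is reused unchanged by the caller.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


def wiener_gain(xi):
    """Wiener-type gain computed from an a priori SNR value per bin."""
    return [x / (1.0 + x) for x in xi]


def dd_a_priori_snr(prev_clean_mag, noisy_mag, noise_psd, beta=0.98):
    """Decision-directed a priori SNR estimate for the current frame.

    prev_clean_mag is the speech magnitude estimated in the PREVIOUS frame,
    which is the source of the one-frame lag (bias) discussed in the text.
    """
    xi = []
    for s_prev, y, n in zip(prev_clean_mag, noisy_mag, noise_psd):
        post = y * y / n  # a posteriori SNR of the current frame
        xi.append(beta * s_prev * s_prev / n
                  + (1.0 - beta) * max(post - 1.0, 0.0))
    return xi


def tsnr_gains(prev_clean_mag, noisy_mag, noise_psd, beta=0.98):
    """Two-step sketch: DD gain first, then a refined gain for the same frame."""
    # Step 1: decision-directed estimate and its gain.
    g_dd = wiener_gain(dd_a_priori_snr(prev_clean_mag, noisy_mag, noise_psd, beta))
    # Step 2: re-estimate the a priori SNR from the current frame filtered by
    # the step-1 gain, so the final gain no longer depends on the previous frame.
    xi_tsnr = [(g * y) ** 2 / n for g, y, n in zip(g_dd, noisy_mag, noise_psd)]
    return g_dd, wiener_gain(xi_tsnr)


if __name__ == "__main__":
    # Speech onset in a single bin: nothing in the previous frame, strong
    # component in the current frame, unit noise PSD.
    g_dd, g_tsnr = tsnr_gains([0.0], [10.0], [1.0])
    print("DD gain:", g_dd[0], "TSNR gain:", g_tsnr[0])
```

In the onset example, the DD gain stays well below 1 because it still leans on the empty previous frame, while the second step moves the gain much closer to 1 for the same frame. This is the bias removal that the text attributes to TSNR.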
All of the algorithms discussed above have been implemented, and their performance has been evaluated using both subjective and objective criteria. Performance improves significantly when HRNR is combined with TSNR, compared with TSNR or DD alone, because HRNR restores the harmonics. Spectral subtraction performs well below the other methods, as expected given its stationary-noise assumption.
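To make the combination of HRNR with TSNR more concrete, here is a minimal Python sketch of a harmonic regeneration step for a single frame. It is a sketch under stated assumptions, not the evaluated system. It assumes half-wave rectification as the non-linearity, the TSNR gain as the mixing weight between the enhanced and regenerated spectra, and a Wiener-type final gain. The naive `dft` helper and all names are illustrative, and windowing and overlap-add are omitted.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform, adequate for a short demo frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def hrnr_gain(enhanced_frame, tsnr_gain, noise_psd):
    """Refine per-bin gains using a harmonically regenerated signal.

    enhanced_frame: time-domain frame after a first noise reduction pass.
    tsnr_gain:      per-bin gain from that pass, reused here as mixing weight.
    noise_psd:      per-bin noise power estimate (assumed to be given).
    """
    # Half-wave rectification creates energy at harmonics of the fundamental,
    # including harmonics that the first pass may have suppressed.
    harmo_frame = [max(s, 0.0) for s in enhanced_frame]
    spec = dft(enhanced_frame)
    harmo_spec = dft(harmo_frame)

    gains = []
    for s_k, h_k, rho, n in zip(spec, harmo_spec, tsnr_gain, noise_psd):
        # Trust the enhanced spectrum where the first-pass gain is high and
        # the regenerated spectrum where it is low.
        xi = (rho * abs(s_k) ** 2 + (1.0 - rho) * abs(h_k) ** 2) / n
        gains.append(xi / (1.0 + xi))
    return gains


if __name__ == "__main__":
    n = 16
    # Enhanced frame containing only a fundamental at bin 2; its second
    # harmonic (bin 4) is missing, as if removed by the first pass.
    frame = [math.sin(2.0 * math.pi * 2 * t / n) for t in range(n)]
    gains = hrnr_gain(frame, [0.1] * n, [1.0] * n)
    print("gain at missing harmonic (bin 4):", round(gains[4], 3))
```

In this toy frame the enhanced spectrum has no energy at bin 4, but the rectified signal does, so the refined gain at that bin rises well above the assumed first-pass value of 0.1. This mirrors the restoration of harmonics described above.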
