Tracking speech-presence uncertainty to improve speech enhancement in non-stationary noise environments

Speech enhancement algorithms that estimate the short-time spectral amplitude of the clean speech perform better when they apply a soft-decision gain modification that depends on the a priori probability of speech absence. Previously reported work assumes a fixed probability, q. Since speech is non-stationary and, even when voiced, may not be present in every frequency bin, we propose a method for estimating distinct values of q for different bins and tracking them over time. The estimation is based on a decision-theoretic approach for setting a threshold in each bin, followed by short-time averaging. The estimated q's are used to control both the gain and the update of the estimated noise spectrum during speech presence in a modified MMSE log-spectral amplitude estimator. In subjective tests on pre-processing noisy speech for a coding application, the proposed method scored higher than the IS-127 standard enhancement algorithm.
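The per-bin tracking of q described above (a threshold decision in each bin followed by short-time averaging) can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `update_speech_absence_prob`, the use of the a posteriori SNR as the test statistic, the threshold value, the smoothing constant `alpha`, and the choice of a first-order recursive average as the short-time average are all assumptions made for illustration.

```python
def update_speech_absence_prob(q_prev, test_stat, threshold=0.8, alpha=0.95):
    """Update per-bin estimates of q (probability of speech absence) for one frame.

    q_prev    : previous q estimates, one per frequency bin
    test_stat : per-bin test statistic (assumed here: a posteriori SNR)
    threshold : hypothetical decision threshold; below it, speech is declared absent
    alpha     : hypothetical smoothing constant of the recursive short-time average
    """
    q_new = []
    for q, stat in zip(q_prev, test_stat):
        # Hard decision in this bin: 1.0 means "speech absent", 0.0 means "speech present".
        absent = 1.0 if stat < threshold else 0.0
        # Recursive averaging of the hard decisions tracks q over time.
        q_new.append(alpha * q + (1.0 - alpha) * absent)
    return q_new


# Usage: bin 0 contains only noise, bin 1 contains strong speech in every frame.
q = [0.5, 0.5]
for _ in range(50):
    q = update_speech_absence_prob(q, [0.5, 10.0])
print(q)
```

With these assumed settings, the estimate for the noise-only bin rises toward 1 and the estimate for the speech bin falls toward 0, so each bin carries its own q instead of one fixed value.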
