Speech enhancement: new approaches to soft decision

In this paper, we propose new approaches to speech enhancement based on soft decision. To improve the statistical reliability of speech activity estimation, we introduce the concept of a global speech absence probability (GSAP). We first compute the conventional speech absence probability (SAP) and then modify it according to the newly proposed GSAP. The modification is made such that the SAP takes the value of the GSAP when speech is absent and retains its original value when speech is present. To improve the SAPs at voice tails (transition periods from speech to silence), we revise them using a hang-over scheme based on the hidden Markov model (HMM). In addition, we suggest a robust noise update algorithm in which the noise power is estimated, based on soft decision, not only during periods of speech absence but also during speech activity. To improve the SAP determination and noise update routines, we present a new signal-to-noise ratio (SNR) concept, called the predicted SNR. We also demonstrate that the discrete cosine transform (DCT) enhances the accuracy of the SAP estimation. A number of tests show that the proposed method, called the speech enhancement based on soft decision (SESD) algorithm, yields better performance than the conventional approaches.

Key words: speech enhancement, global soft decision, hang-over, predicted SNR, DCT
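The quantities named in the abstract can be illustrated with a minimal sketch. It is not the SESD algorithm itself: it assumes the textbook complex-Gaussian statistical model, in which each frequency bin has an a posteriori SNR `gamma` and an a priori SNR `xi`, and it assumes statistically independent bins when forming the global probability. The function names, the prior ratio `q`, and the smoothing factor `zeta` are hypothetical choices made for illustration.

```python
import math


def sap(gamma, xi, q=1.0):
    """Per-bin speech absence probability p(H0 | Y_k).

    Assumes a complex Gaussian model; q = P(H1) / P(H0) is an assumed prior ratio.
    """
    likelihood_ratio = math.exp(gamma * xi / (1.0 + xi)) / (1.0 + xi)
    return 1.0 / (1.0 + q * likelihood_ratio)


def gsap(gammas, xis, q=1.0):
    """Global speech absence probability over all bins of one frame.

    Assumes independent bins, so the per-bin likelihood ratios multiply.
    The product is accumulated in the log domain to avoid overflow.
    """
    log_lr = sum(g * x / (1.0 + x) - math.log1p(x) for g, x in zip(gammas, xis))
    z = math.log(q) + log_lr
    if z > 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def update_noise(noise_pow, y_pow, p_absent, zeta=0.95):
    """Soft-decision noise power update for one bin.

    The noisy power y_pow is weighted by the absence probability, so the
    estimate keeps adapting during speech activity instead of freezing.
    """
    expected = p_absent * y_pow + (1.0 - p_absent) * noise_pow
    return zeta * noise_pow + (1.0 - zeta) * expected
```

With `p_absent` close to 1 the update tracks the observed power, and with `p_absent` close to 0 the previous noise estimate is retained, which is the behaviour a soft-decision update is meant to provide.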
