Speech pause detection for noise spectrum estimation by tracking power envelope dynamics

Speech pause detection is an important and sensitive part of most single-microphone noise reduction schemes for enhancing speech corrupted by additive noise, because the estimate of the background noise is usually determined while speech is absent. An algorithm is proposed that detects speech pauses by adaptively tracking minima in the power envelope of the noisy signal, both for the broadband signal and for its high-pass and low-pass filtered versions. At poor signal-to-noise ratios (SNRs), the proposed algorithm maintains a low false-alarm rate in the detection of speech pauses, whereas the false-alarm rate of the standardized ITU G.729 algorithm increases in unfavorable situations. These characteristics are found with different types of noise and indicate that the proposed algorithm is better suited to noise estimation in noise reduction algorithms, as speech deterioration may thus be kept low. It is also shown that, in combination with the Ephraim-Malah (1984) noise reduction scheme, speech pause detection performance can be improved further by using the noise-reduced signal instead of the noisy signal as input to the speech pause decision unit.
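The idea of tracking minima in a power envelope can be illustrated with a minimal single-band sketch. It is not the proposed algorithm: the function names, the frame length, the upward drift `release_db`, and the decision margin `margin_db` are all assumptions made for illustration, and the broadband, high-pass, and low-pass combination described above is omitted.

```python
import math


def frame_power_db(samples, frame_len):
    """Mean power of each non-overlapping frame, in dB (hypothetical helper)."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        powers.append(10.0 * math.log10(mean_sq + 1e-12))
    return powers


def detect_pauses(power_db, release_db=0.05, margin_db=3.0):
    """Flag frames whose power lies within margin_db of a tracked minimum.

    release_db and margin_db are illustrative values, not taken from the paper.
    """
    if not power_db:
        return []
    tracked_min = power_db[0]
    flags = []
    for p in power_db:
        # Follow new minima immediately; otherwise let the tracked
        # minimum drift upward slowly so it can adapt to a rising noise floor.
        tracked_min = min(p, tracked_min + release_db)
        flags.append(p - tracked_min <= margin_db)
    return flags
```

A sketch of this kind assumes the signal begins with a pause: if the first frame contains speech, the tracked minimum starts too high and early frames are misclassified until a real minimum is observed.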

[1]  Mark Marzinzik,et al.  Noise Reduction Schemes for Digital Hearing Aids and Their Use for the Hearing Impaired , 2001 .

[2]  Gerhard Doblinger,et al.  Computationally efficient speech enhancement by spectral minima tracking in subbands , 1995, EUROSPEECH.

[3]  Y. Ephraim,et al.  Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .

[4]  Silvio Montrésor,et al.  Speech signal detection in noisy environment using a local entropic criterion , 1997, EUROSPEECH.

[5]  D. Paul.  The spectral envelope estimation vocoder , 1981 .

[6]  P. Kabal,et al.  Comparison of voice activity detection algorithms for wireless personal communications systems , 1997, CCECE '97. Canadian Conference on Electrical and Computer Engineering. Engineering Innovation: Voyage of Discovery. Conference Proceedings.

[7]  Thomas Wittkop,et al.  Two-channel noise reduction algorithms motivated by models of binaural interaction , 2001 .

[8]  Rainer Martin,et al.  An efficient algorithm to estimate the instantaneous SNR of speech signals , 1993, EUROSPEECH.

[9]  Masahide Mizushima,et al.  Environmental noise reduction based on speech/non-speech identification for hearing aids , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  Wonyong Sung,et al.  A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.

[11]  Fei Xie,et al.  A comparative study of speech detection methods , 1997, EUROSPEECH.

[12]  H. Sheikhzadeh,et al.  Real-time implementation of HMM-based MMSE algorithm for speech enhancement in hearing aid applications , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[13]  James P. Egan,et al.  Signal detection theory and ROC analysis , 1975 .

[14]  K. Srinivasan,et al.  Voice activity detection for cellular networks , 1993, Proceedings, IEEE Workshop on Speech Coding for Telecommunications.

[15]  Pavel Sovka,et al.  The study of speech/pause detectors for speech enhancement methods , 1995, EUROSPEECH.

[16]  Gary H. Whipple,et al.  Model based speech pause detection , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[17]  J. Proudfoot,et al.  Noise , 1931, The Indian medical gazette.

[18]  C Ludvigsen,et al.  The design and testing of a noise reduction algorithm based on spectral subtraction. , 1993, Scandinavian audiology. Supplementum.

[19]  Hans-Günter Hirsch,et al.  Noise estimation techniques for robust speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[20]  Christoph Draxler.  Introduction to the Verbmobil-PhonDat Database of Spoken German , 1995 .

[21]  Rafik A. Goubran,et al.  SNR estimation of speech signals using subbands and fourth-order statistics , 1999, IEEE Signal Processing Letters.

[22]  George S. Kang,et al.  Quality improvement of LPC-processed noisy speech by using spectral subtraction , 1989, IEEE Trans. Acoust. Speech Signal Process..

[23]  Rainer Martin,et al.  Spectral Subtraction Based on Minimum Statistics , 2001 .

[24]  George Carayannis,et al.  Higher order statistics based Gaussianity test applied to on-line speech processing , 1994, Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers.