Subband wavelet signal denoising for voice activity detection

In this paper we propose a method for voice activity detection (VAD) in speech signals recorded in the presence of noise. Endpoint detection (EPD), i.e., detection of the boundaries of voice activity (speech), is very difficult when the signal is acquired in a noisy environment. The proposed VAD method adds a stage of wavelet subband denoising. We compared this approach with other standard methods, namely zero-crossing rate and spectral entropy analysis. We also present basic results illustrating the main aim of this contribution, which is to apply intelligent denoising strategies to various VAD algorithms.
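To make the idea of a denoising stage placed in front of a VAD feature more concrete, the following is a minimal sketch and not the method of the paper. The abstract does not name the wavelet, the number of decomposition levels, or the thresholding rule, so the Haar wavelet, three levels, soft thresholding, and the universal threshold with a median-based noise estimate are all assumptions made for illustration. The function names (haar_dwt, haar_idwt, denoise, zero_crossing_rate) are hypothetical.

```python
import math
import random


def haar_dwt(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(c, t):
    """Shrink a coefficient towards zero by t (assumed thresholding rule)."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def denoise(x, levels=3):
    """Soft-threshold every detail subband; len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)  # details[0] is the finest subband
    # Assumption: noise level estimated from the finest subband (median / 0.6745),
    # combined with the universal threshold sigma * sqrt(2 ln N).
    finest = sorted(abs(c) for c in details[0])
    sigma = finest[len(finest) // 2] / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    # Rebuild the signal from the coarsest subband back to the finest one.
    for d in reversed(details):
        approx = haar_idwt(approx, [soft_threshold(c, t) for c in d])
    return approx


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Synthetic example: a noise-only frame followed by a noisy low-frequency tone.
random.seed(0)
silence = [random.gauss(0.0, 0.1) for _ in range(256)]
speech = [math.sin(2 * math.pi * n / 64) + random.gauss(0.0, 0.1) for n in range(256)]
cleaned = denoise(silence + speech)
print("ZCR, noisy silence:   ", zero_crossing_rate(silence))
print("ZCR, denoised silence:", zero_crossing_rate(cleaned[:256]))
print("ZCR, denoised tone:   ", zero_crossing_rate(cleaned[256:]))
```

In this sketch the zero-crossing rate is computed on the denoised frames rather than on the raw signal. A spectral entropy feature could be placed after the same denoise call in the same way.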
