Voice activity detection based on short-time energy and noise spectrum adaptation

Building on the short-time energy of speech signals and the efficient noise-statistics adaptation method proposed by Sohn et al. (1998), this paper proposes a new, highly robust voice activity detection (VAD) rule intended to work under any kind of environmental noise. On average, the recognition accuracy of the new method is about five percent higher than that of Sohn's method, and the new method retains Sohn's ability to track the noise spectrum properly. Simulation experiments show that the new method is an efficient and robust voice activity detector.
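The abstract does not state the decision rule itself, so the following is only a minimal sketch of the general idea of combining short-time energy with an adaptive noise estimate. It is not the rule proposed in the paper, and it uses a hard-decision energy update rather than the soft-decision spectral adaptation of Sohn et al. The function names, the frame length, hop size, threshold ratio, smoothing factor, and the synthetic test signal are all assumptions chosen for illustration.

```python
import math
import random

def short_time_energy(samples, frame_len=160, hop=80):
    """Return the mean squared amplitude of each (overlapping) frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def energy_vad(energies, init_frames=5, ratio=4.0, alpha=0.95):
    """Flag a frame as speech when its energy exceeds ratio * noise estimate.

    Assumption: the first init_frames frames contain noise only.
    The noise estimate is updated only on frames judged non-speech,
    a hard-decision simplification used here purely for illustration.
    """
    noise = sum(energies[:init_frames]) / init_frames
    decisions = []
    for e in energies:
        is_speech = e > ratio * noise
        if not is_speech:
            # Recursive averaging lets the estimate follow slow noise changes.
            noise = alpha * noise + (1 - alpha) * e
        decisions.append(is_speech)
    return decisions

# Synthetic signal at 8 kHz: 0.5 s of noise, 0.5 s of tone plus noise, 0.5 s of noise.
rng = random.Random(0)
fs = 8000
signal = []
for n in range(12000):
    x = rng.gauss(0.0, 0.01)                          # background noise
    if 4000 <= n < 8000:
        x += 0.5 * math.sin(2 * math.pi * 440 * n / fs)  # stand-in for speech
    signal.append(x)

energies = short_time_energy(signal)
decisions = energy_vad(energies)
```

In this toy setting the frames that lie entirely inside the tone burst are flagged as active and the noise-only frames are not. A fixed energy ratio like this degrades quickly at low signal-to-noise ratios, which is the kind of situation where tracking the noise statistics, as in Sohn's method, becomes important.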

[1]  Allen Gersho, et al. An overview of variable rate speech coding for cellular networks, 1992, 1992 IEEE International Conference on Selected Topics in Wireless Communications.

[2]  Wonyong Sung, et al. A voice activity detector employing soft decision based noise spectrum adaptation, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No. 98CH36181).

[3]  E. Shlomot, et al. ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications, 1997, IEEE Commun. Mag.

[4]  Anthony G. Constantinides, et al. Residual echo signal in critically sampled subband acoustic echo cancellers based on IIR and FIR filter banks, 1997, IEEE Trans. Signal Process.

[5]  Theodore S. Rappaport, et al. Wireless communications - principles and practice, 1996.

[6]  Paul T. Brady, et al. A technique for investigating on-off patterns of speech, 1965.

[7]  Wonyong Sung, et al. A statistical model-based voice activity detection, 1999, IEEE Signal Processing Letters.