A Statistical Approach for Voiced Speech Detection

Detecting voiced segments in a speech signal is a challenging problem in developing high-performance systems for use in noisy environments. In this paper, we present an efficient algorithm for robust voiced speech detection and apply it to variable-rate speech coding. The key idea of the algorithm is to consider speech energy and zero-crossing rate (ZCR) information simultaneously when processing the speech signal and locating its end point. In addition, we develop a decision rule and a background-noise statistics estimator by applying a statistical model. The robust decision rule is derived from the generalized likelihood ratio test (LRT) under the assumption that the noise statistics are known a priori. The algorithm is most efficient under time-varying noise. According to our simulation results, the proposed algorithm performs significantly better at low signal-to-noise ratios and in noisy environments.
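As a rough illustration of how energy and ZCR can be used together, the sketch below computes both features per frame and marks a frame as voiced when its energy is high and its ZCR is low. It is not the statistical decision rule of the paper: the function names, the frame length, the hop size, and both thresholds are hypothetical values chosen only for this example, and a fixed-threshold rule stands in for the likelihood ratio test.

```python
import math


def frame_features(samples, frame_len=160, hop=80):
    """Return one (energy, zcr) pair per frame.

    frame_len and hop are illustrative assumptions (20 ms and 10 ms at 8 kHz).
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features


def is_voiced(energy, zcr, energy_thresh=0.01, zcr_thresh=0.25):
    """Fixed-threshold rule (assumed): voiced frames have high energy and low ZCR."""
    return energy > energy_thresh and zcr < zcr_thresh


def find_endpoints(samples):
    """Return (first, last) indices of voiced frames, or None if there are none."""
    flags = [is_voiced(e, z) for e, z in frame_features(samples)]
    voiced = [i for i, flag in enumerate(flags) if flag]
    if not voiced:
        return None
    return voiced[0], voiced[-1]


if __name__ == "__main__":
    # Synthetic test signal: a 100 Hz tone (voiced-like) surrounded by
    # low-level, rapidly alternating samples (noise-like), at 8 kHz.
    tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(800)]
    hiss = [0.01 * (-1) ** n for n in range(800)]
    print(find_endpoints(hiss + tone + hiss))
```

Using the two features jointly matters because neither is sufficient alone: loud broadband noise has high energy but also a high ZCR, while quiet low-frequency hum has a low ZCR but little energy. In practice the fixed thresholds would be replaced by values adapted to the estimated background noise.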
