Voice activity detection using haircell model in noisy environment

Voice activity detection (VAD) separates a speech signal into voice and non-voice segments. It is commonly used in speech processing and communication systems such as speech recognition, hands-free telephony, audio conferencing, and noise cancellation. This paper presents a new VAD method based on a haircell model (HCM), in which ordinary differential equations (Eq. 2.1) simulate the inner-ear hair cells of the human auditory system. The method is compared with the ITU-T G.729B VAD (A. Benyassine et al., IEEE Commun. Mag., pp. 64-72, Sept. 1997). The results show that the proposed method performs better than G.729B.
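
For readers new to the task, the frame-level voice/non-voice decision that any VAD produces can be illustrated with a minimal sketch. This is a generic energy-threshold baseline, not the haircell-model method and not the G.729B algorithm; the function names, the 160-sample frame length (20 ms at an assumed 8 kHz sampling rate), and the threshold value are all assumptions chosen for illustration.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def energy_vad(samples, frame_len=160, threshold=0.01):
    """Label each frame True (voice) or False (non-voice).

    Hypothetical baseline: a frame counts as voice when its mean
    energy exceeds a fixed threshold (an assumed value).
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]

# Synthetic input: three low-level frames followed by three frames
# of a 200 Hz tone standing in for voiced speech.
fs = 8000  # assumed sampling rate in Hz
quiet = [0.01 * math.sin(2 * math.pi * 1234 * n / fs) for n in range(480)]
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(480)]
decisions = energy_vad(quiet + tone)
print(decisions)
```

A fixed energy threshold of this kind tends to fail when the background noise level approaches the speech level, which is the setting that noise-robust VAD methods target.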

[1] Harry Wechsler et al. Detection of human speech in structured noise, 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] Rafik A. Goubran et al. Robust voice activity detection using higher-order statistics in the LPC residual domain, 2001, IEEE Trans. Speech Audio Process.

[3] R. Tucker et al. Voice activity detection using a periodicity measure, 1992.

[4] E. Shlomot et al. ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications, 1997, IEEE Commun. Mag.

[5] Lawrence R. Rabiner et al. Voiced-unvoiced-silence detection using the Itakura LPC distance measure, 1977.

[6] John Mason et al. Robust voice activity detection using cepstral features, 1993, Proceedings of TENCON '93. IEEE Region 10 International Conference on Computers, Communications and Automation.