A voice activity detector based on cepstral analysis

This paper proposes a new approach to speech end-point detection based on cepstral analysis. The algorithm uses explicit (static) models of speech and non-speech, and makes a decision on each incoming (overlapped) cepstral frame according to its model similarity scores. Cepstral analysis provides excellent level independence, so that adjustment of parameters, decision thresholds, etc. is unnecessary. A high degree of robustness to additive noise is demonstrated, even though the models are static. Accurate end-points are recovered at SNR levels of 0 dB.
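
As a rough illustration of frame-wise cepstral classification, the following Python sketch computes real cepstra on overlapped frames and labels each frame by comparing its scores under two fixed models. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not specify the model form, so diagonal Gaussians are assumed here, and the function names and the values of `frame_len`, `hop`, and `n_ceps` are hypothetical. Dropping the zeroth cepstral coefficient is one common way to obtain level independence, since a change in signal gain adds a constant to the log spectrum and therefore affects only that coefficient.

```python
import numpy as np


def cepstral_frames(signal, frame_len=256, hop=128, n_ceps=12):
    """Return real cepstra (c1..c_n_ceps) for overlapped, windowed frames.

    frame_len, hop and n_ceps are illustrative values, not taken from the paper.
    """
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    ceps = np.empty((n_frames, n_ceps))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # Log magnitude spectrum; the small floor avoids log(0).
        log_mag = np.log(np.abs(np.fft.rfft(frame)) + 1e-10)
        c = np.fft.irfft(log_mag)
        # Drop c0: a gain change only shifts the log spectrum by a constant,
        # which lands entirely in c0.
        ceps[i] = c[1 : n_ceps + 1]
    return ceps


def log_score(ceps, mean, var):
    """Per-frame log-likelihood under a diagonal Gaussian (assumed model form)."""
    return -0.5 * np.sum((ceps - mean) ** 2 / var + np.log(2 * np.pi * var), axis=1)


def detect_speech(ceps, speech_model, nonspeech_model):
    """Label a frame as speech when the speech model scores higher.

    Each model is a (mean, var) pair estimated beforehand and then kept fixed.
    """
    return log_score(ceps, *speech_model) > log_score(ceps, *nonspeech_model)
```

In this sketch no decision threshold is tuned: each frame is simply assigned to whichever fixed model scores higher, and end-points would then be read off from the transitions in the resulting sequence of frame labels.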