Use of Pitch Continuity for Robust Speech Activity Detection

Speech activity detection (SAD) is an important component of many speech processing applications and has been studied extensively in recent years. Pitch continuity, a significant characteristic of speech, has nevertheless not been exploited successfully in existing SAD methods. In this work, we propose a novel way to integrate pitch continuity into pitch-related features. We demonstrate the idea on the Combo-SAD approach: we examine three consecutive frames and assume, based on pitch continuity, that they all share the pitch of the center frame. The corresponding feature values are recomputed at the adjusted pitch location and then used in the final expression. The new combo feature is evaluated under various types of additive noise at different signal-to-noise ratios (SNRs). The results show that the new feature leads to better SAD performance, with up to a 39.3% relative improvement in miss rate over Combo-SAD. We also introduce a novel variant of the underlying autocorrelation function and illustrate how it can improve the accuracy of pitch detection.
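The three-frame idea can be illustrated with a minimal Python sketch. It is not the Combo-SAD implementation or the feature proposed in this work. It assumes a plain normalized autocorrelation as the pitch-related feature, picks the pitch lag of the center frame as the autocorrelation maximum within a lag range, and averages the autocorrelation values of the previous, center, and next frames at that lag. The function names, the lag range, and the averaging rule are all hypothetical choices made for illustration.

```python
import numpy as np


def normalized_autocorr(frame):
    """Normalized autocorrelation of one frame (equals 1 at lag 0 for a nonzero frame)."""
    frame = frame - np.mean(frame)
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return np.zeros(len(frame))
    full = np.correlate(frame, frame, mode="full")
    # Keep non-negative lags only.
    return full[len(frame) - 1:] / energy


def continuity_peak_feature(frames, min_lag, max_lag):
    """Illustrative pitch-continuity feature (assumed form, not the paper's expression).

    frames: 2D array of shape (n_frames, frame_len).
    For each frame, the pitch lag is estimated from that (center) frame alone;
    the autocorrelation of the neighboring frames is then read at the same lag,
    and the available values (two at the edges, three elsewhere) are averaged.
    """
    acfs = np.array([normalized_autocorr(f) for f in frames])
    n_frames = len(frames)
    feature = np.zeros(n_frames)
    for t in range(n_frames):
        # Pitch lag of the center frame.
        lag = min_lag + int(np.argmax(acfs[t, min_lag:max_lag + 1]))
        lo, hi = max(t - 1, 0), min(t + 1, n_frames - 1)
        # Neighbors are evaluated at the center frame's lag (pitch continuity).
        feature[t] = np.mean(acfs[lo:hi + 1, lag])
    return feature
```

In this sketch, a run of voiced frames with a stable pitch keeps a high feature value, because all three frames peak at the same lag. For noise, a spurious peak in the center frame is unlikely to recur at the same lag in its neighbors, so the averaged value drops.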