SYSTEMS FOR ROBUST SPEECH ACTIVITY DETECTION AND THEIR RESULTS WITH THE RT05 AND RT06 EVALUATION TESTS

Robust Speech Activity Detection (SAD) systems are required in smart-room environments because of the presence of noise and reverberation. In this work, a previous SAD system, based on LDA-extracted features and a decision tree classifier, has been modified in both feature extraction and classification to significantly improve its performance. New features based on low- and high-frequency energy dynamics, and classifiers based on SVM and GMM, have been investigated. In particular, a specific training process has been developed for the SVM case to cope with the problems that classifier presents in our application. The resulting SAD systems have been trained with a subset of the SPEECON database. Tested in realistic conditions with the meeting databases from the NIST RT05 and RT06 evaluations, they have shown large improvements in speech detection performance.