Audio segmentation, classification and clustering in a broadcast news task

This paper describes the development of an audio segmentation, classification and clustering system applied to a broadcast news task for European Portuguese. We developed a new audio segmentation algorithm that is accurate and uses fewer computational resources than other approaches. Our speaker clustering module uses a modified BIC (Bayesian information criterion) algorithm that performs substantially better than the standard symmetric Kullback-Leibler distance (KL2) and is much faster than the full BIC. Finally, we developed a scheme for tagging certain speaker clusters (anchors) using trained cluster models. A series of tests shows the advantage of the new algorithms. The system is part of a prototype that processes, on a daily basis, the main news show of the national Portuguese broadcaster.
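For readers unfamiliar with the two baseline criteria named above, the following is a minimal sketch of the textbook forms of the symmetric Kullback-Leibler distance (KL2) and the BIC merge test between two segments of feature vectors. It is not the modified BIC developed in this work. The function names (`kl2_diag`, `delta_bic`), the `penalty_weight` parameter, and the Gaussian modelling choices (diagonal covariance for KL2, full covariance for BIC) are assumptions made purely for illustration.

```python
import numpy as np

def kl2_diag(x, y):
    """Symmetric KL distance between two segments, each modelled as a
    single diagonal-covariance Gaussian (illustrative assumption)."""
    mx, my = x.mean(axis=0), y.mean(axis=0)
    vx, vy = x.var(axis=0), y.var(axis=0)
    d2 = (mx - my) ** 2
    # KL(p||q) + KL(q||p); the log-variance terms cancel in the sum.
    return 0.5 * np.sum(vx / vy + vy / vx - 2.0 + d2 * (1.0 / vx + 1.0 / vy))

def _logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet

def delta_bic(x, y, penalty_weight=1.0):
    """Textbook delta-BIC for two segments with full-covariance Gaussians.
    A positive value favours keeping the segments apart; a negative value
    favours merging them. penalty_weight is a hypothetical tuning knob."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    d = x.shape[1]
    pooled = np.vstack([x, y])
    gain = 0.5 * (n * _logdet_cov(pooled)
                  - n1 * _logdet_cov(x)
                  - n2 * _logdet_cov(y))
    n_params = d + d * (d + 1) / 2  # mean plus symmetric covariance
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    same_a = rng.normal(0.0, 1.0, size=(500, 4))
    same_b = rng.normal(0.0, 1.0, size=(500, 4))
    other = rng.normal(5.0, 1.0, size=(500, 4))
    print("KL2 same/other:", kl2_diag(same_a, same_b), kl2_diag(same_a, other))
    print("dBIC same/other:", delta_bic(same_a, same_b), delta_bic(same_a, other))
```

In this sketch KL2 needs only per-segment means and variances, whereas the BIC test requires covariance determinants for both segments and for their union, which illustrates in general terms why a full BIC comparison is the more expensive of the two.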