Speaker Tracking by Anchor Models Using Speaker Segment Cluster Information

In this paper, we present a speaker tracking system based entirely on the anchor models approach. The aim is twofold: to evaluate whether the probabilistic anchor models approach, which models a speaker by a normal distribution in the anchor models space, performs well in speaker tracking, and to investigate how speaker segment cluster information can improve speaker tracking performance. Evaluation is carried out on the audio database of the ESTER evaluation campaign for the rich transcription of French broadcast news. Results show that deterministic metrics on anchor models are suitable for segmentation and clustering tasks, whereas the probabilistic approach on anchor models gives interesting results for speaker tracking. Tracking performance also improves when all segments of a cluster are pooled together prior to the classification process. The gain appears as a higher recall rate on short segments.
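
The two ideas named above, a normal distribution per speaker in the anchor models space and pooling of a cluster's segments before classification, can be illustrated with a minimal sketch. It is not the system evaluated in the paper. It assumes that each segment has already been mapped to one or more anchor-space vectors (scores against a set of anchor models), that the covariance is diagonal, and that pooling means stacking those vectors. The function names and the variance floor are hypothetical.

```python
import numpy as np


def fit_speaker_gaussian(anchor_vectors, var_floor=1e-3):
    """Fit a diagonal-covariance normal distribution to (n, d) anchor-space vectors.

    The diagonal covariance and the variance floor are assumptions of this sketch.
    """
    x = np.asarray(anchor_vectors, dtype=float)
    mean = x.mean(axis=0)
    var = np.maximum(x.var(axis=0), var_floor)
    return mean, var


def avg_log_likelihood(anchor_vectors, mean, var):
    """Average per-vector log-likelihood under the speaker's normal distribution."""
    x = np.atleast_2d(np.asarray(anchor_vectors, dtype=float))
    per_vector = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var).sum(axis=1)
    return float(per_vector.mean())


def score_cluster(segments, mean, var):
    """Pool the vectors of every segment in a cluster, then score them once."""
    pooled = np.vstack([np.atleast_2d(s) for s in segments])
    return avg_log_likelihood(pooled, mean, var)


# Toy 2-dimensional anchor space (real systems use many more anchors).
train = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
mean, var = fit_speaker_gaussian(train)

# A cluster made of one short segment and one longer segment.
near_cluster = [np.array([[0.5, 0.5]]), np.array([[0.4, 0.6], [0.6, 0.4]])]
far_cluster = [np.array([[3.0, 3.0]]), np.array([[2.5, 3.5], [3.5, 2.5]])]

near_score = score_cluster(near_cluster, mean, var)
far_score = score_cluster(far_cluster, mean, var)
```

In this sketch every segment of a cluster shares the single pooled score, so a short segment is judged on the evidence of its whole cluster instead of its own few vectors. This is one plausible reading of why pooling would help recall on short segments.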
