The ELISA consortium approaches in broadcast news speaker segmentation during the NIST 2003 rich transcription evaluation

This paper presents the ELISA consortium's work on automatic speaker segmentation, also known as speaker diarization, for the NIST 2003 Rich Transcription (RT) evaluation. The experiments were conducted on real broadcast news data (HUB4). Two approaches, from the CLIPS and LIA laboratories, are presented, and different ways of combining them are investigated within the framework of the ELISA consortium. The system submitted as the ELISA primary system obtained the second-lowest segmentation error rate among the primary systems of the RT03 participants. Another ELISA system, submitted as a secondary system, outperformed the best primary system and obtained the lowest speaker segmentation error rate.
