论文信息 - Automatic named identification of speakers using diarization and ASR systems

Automatic named identification of speakers using diarization and ASR systems

In this paper, we consider the extraction of speaker identity from audio records of broadcast news without a priori acoustic information about speakers. Using an automatic speech recognition system and an automatic speaker diarization system, we present improvements for a method which allows to extract speaker identities from automatic transcripts and to assign them to speech segments. Experiments are carried out on French broadcast news records from the ESTER 1 evaluation campaign. Experimental results using outputs of automatic speech recognition and automatic diarization are presented.

Sylvain Meignier | Yannick Estève | Christine Jacquin | Simon Petit-Renaud | Vincent Jousse

[1] Sue Tranter. Who Really Spoke When? Finding Speaker Turns and Identities in Broadcast News Audio , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[2] Patrick Nguyen,et al. Finding Speaker Identities with a Conditional Maximum Entropy Model , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[3] Paul Deléglise,et al. Extracting true speaker identities from transcriptions , 2007, INTERSPEECH.

[4] L. Lamel,et al. A comparative study using manual and automatic transcriptions for diarization , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005..

[5] Julie Mauclair,et al. Speaker Diarization: About whom the Speaker is Talking ? , 2006, 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop.

[6] Douglas A. Reynolds,et al. A Tutorial on Text-Independent Speaker Verification , 2004, EURASIP J. Adv. Signal Process..

[7] Jean-François Bonastre,et al. Step-by-step and integrated approaches in broadcast news speaker diarization , 2006, Comput. Speech Lang..

[8] Paul Deléglise,et al. The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news , 2005, INTERSPEECH.

[9] Guillaume Gravier,et al. The ESTER phase II evaluation campaign for the rich transcription of French broadcast news , 2005, INTERSPEECH.