Paper information - Speaker Diarization: About whom the Speaker is Talking?

Speaker Diarization: About whom the Speaker is Talking?

Automatic speaker diarization consists of splitting an audio signal into homogeneous segments and clustering those segments by speaker. However, the resulting speaker clusters carry only anonymous labels. This paper proposes a way to identify the speakers by extracting their full names as pronounced in French broadcast news. A semantic classification tree, built automatically from a training corpus, associates each full name detected in the transcription of a segment with that segment or with one of its neighbors. A merging method then assigns a full name to each speaker cluster in place of the anonymous label provided by the diarization. The experiments are carried out on French broadcast news recordings from the ESTER 2005 evaluation campaign. About 70% of the show duration is correctly processed for both the development and evaluation corpora. On the evaluation corpus, 18.2% of the show duration is wrongly named and no decision is taken for 11.9% of the show duration.
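The abstract does not say how the merging step combines segment-level names into one name per cluster. The sketch below illustrates one plausible reading, under stated assumptions: each segment already carries the full name attributed to it by the classification tree (or none), names are weighted by segment duration, and a tie leads to no decision. The `Segment` type, the `name_clusters` function, the duration weighting, the tie rule, and the example names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    cluster: str            # anonymous diarization label, e.g. "S1"
    duration: float         # segment length in seconds
    name: Optional[str]     # full name attributed to this segment, or None


def name_clusters(segments: List[Segment]) -> Dict[str, Optional[str]]:
    """Assign one full name per cluster by duration-weighted voting.

    Assumption (not from the paper): the name with the largest total
    duration wins; a cluster with no named segment, or with a tie
    between the top two names, gets None ("no decision").
    """
    # votes[cluster][name] = total duration of segments carrying that name
    votes: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for seg in segments:
        if seg.name is not None:
            votes[seg.cluster][seg.name] += seg.duration

    labels: Dict[str, Optional[str]] = {}
    for cluster in {seg.cluster for seg in segments}:
        candidates = votes.get(cluster)
        if not candidates:
            labels[cluster] = None          # no name was ever attributed
            continue
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            labels[cluster] = None          # tie: take no decision
        else:
            labels[cluster] = ranked[0][0]
    return labels


# Made-up example data, for illustration only.
example = [
    Segment("S1", 12.0, "Alice Martin"),
    Segment("S1", 5.0, None),
    Segment("S1", 3.0, "Bob Durand"),
    Segment("S2", 8.0, None),
    Segment("S3", 4.0, "Bob Durand"),
    Segment("S3", 4.0, "Alice Martin"),
]
labels = name_clusters(example)
```

In this toy run, cluster `S1` would be named from its longest-supported candidate, while `S2` (no names) and `S3` (tied candidates) stay anonymous, which mirrors the three outcomes the abstract reports: correctly named, wrongly named, and no decision.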

Julie Mauclair | Sylvain Meignier | Yannick Estève
