This paper presents the design and results of the Rich Transcription Spring 2005 (RT-05S) Meeting Recognition Evaluation, the third in a series of community-wide evaluations of language technologies in the meeting domain. For 2005, four evaluation tasks were supported: a speech-to-text (STT) transcription task and three diarization tasks, “Who Spoke When”, “Speech Activity Detection”, and “Source Localization”. The latter two were first-time, experimental proof-of-concept tasks and were treated as “dry runs”. For the STT task, the lowest word error rate for the multiple distant microphone condition was 30.0%, an impressive 33% relative reduction from the best result obtained in the previous such evaluation, the Rich Transcription Spring 2004 (RT-04S) Meeting Recognition Evaluation. For the “Who Spoke When” diarization task, the lowest diarization error rate was 18.56%, a 19% relative reduction from the best RT-04S result.