The LIMSI 2006 TC-STAR EPPS Transcription Systems

This paper describes the speech recognizers developed to transcribe European Parliament Plenary Sessions (EPPS) in English and Spanish for the second TC-STAR Evaluation Campaign. The recognizers are state-of-the-art systems that use multiple decoding passes, with the lexicon, acoustic models, and language models trained specifically for each transcription task. Compared to the LIMSI TC-STAR 2005 EPPS systems, relative word error rate reductions of about 30% have been achieved on the 2006 development data. On the 2006 EPPS evaluation data, the LIMSI systems obtain word error rates of 8.2% for English and 7.8% for Spanish. Experiments with cross-site adaptation and system combination are also described.
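For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how a word error rate and a relative reduction are conventionally computed. It is not the scoring tool used in the evaluation: the function names, the whitespace tokenization, and the example sentences and figures are illustrative assumptions, and official scoring also applies text normalization that is omitted here.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_length) at the word level.

    Assumes plain whitespace tokenization; no normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1], len(ref)


def wer(pairs):
    """Word error rate in percent over (reference, hypothesis) pairs."""
    errors = 0
    words = 0
    for ref, hyp in pairs:
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
    return 100.0 * errors / words


def relative_reduction(baseline_wer, new_wer):
    """Relative reduction in percent of new_wer with respect to baseline_wer."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up sentences, not EPPS data.
    pairs = [("the session is open", "the session was opened")]
    print(wer(pairs))
    # Hypothetical figures chosen only to illustrate a 30% relative reduction.
    print(relative_reduction(10.0, 7.0))
```

The error count is accumulated over all segments before dividing by the total number of reference words, so long segments weigh more than short ones. A relative reduction of 30% means the new error rate is 0.7 times the baseline, not that it is 30 points lower.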