Where are we in transcribing French broadcast news?

Because French is a highly inflected language, transcribing French broadcast news (BN) is more challenging than transcribing English BN. This is due in part to the large number of homophones among the inflected forms. This paper describes advances in the automatic processing of French broadcast news speech, building on recent improvements to the LIMSI English system. The main differences between the English and French BN systems are: a 200k-word vocabulary to compensate for the lower lexical coverage in French (including contextual pronunciations to model liaisons), a case-sensitive language model, and a POS-based language model to reduce the impact of homophonic gender and number disagreement. The resulting system was evaluated in the first French TECHNOLANGUE-ESTER ASR benchmark test, where it achieved the lowest word error rate by a significant margin. We also report on a 1xRT version of this system.
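For readers unfamiliar with the metric, word error rate can be sketched as the word-level edit distance between a reference transcript and the system hypothesis, divided by the number of reference words. The function below is a minimal illustration, not the ESTER scoring tool: the name `word_error_rate`, the whitespace tokenization, and the absence of any text normalization are all assumptions made for this example. The comparison is case-sensitive and exact, so a homophonic agreement error such as "arrivé" for "arrivés" counts as one substitution.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: tokens are split on whitespace and compared
    exactly (case-sensitive), with no normalization applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# A homophonic number-agreement error: one substitution over four words.
print(word_error_rate("les enfants sont arrivés", "les enfants sont arrivé"))
```

Because the two spellings are pronounced identically, the acoustic model alone cannot separate them, which is why errors of this kind weigh on the French word error rate.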
