Paper information - LIG statistical machine translation systems for IWSLT 2010

LIG statistical machine translation systems for IWSLT 2010

This paper describes the systems developed by the LIG laboratory for the 2010 IWSLT evaluation. We participated in the AE BTEC task and in the new TALK task. For the AE BTEC task we developed two different systems: a statistical phrase-based system and a hierarchical phrase-based system using the Moses toolkit. The combination of these systems, which improves the results on different development sets, constitutes our final submission. This year, we concentrated on the new TALK task. The development of a reference translation system, as well as an ASR output translation system, is presented. For the latter, re-punctuating the ASR output before translation seems to be very useful, while segmenting the ASR flow, which is also discussed in this paper, has proven less useful. Unsuccessful attempts to exploit ASR lattices instead of the ASR 1-best output are also presented at the end of the paper.

Hervé Blanchon | Laurent Besacier | Thi-Ngoc-Diep Do | Haithem Afli | Marion Potet
