The 2006 LIMSI Statistical Machine Translation System for TC-STAR

This paper presents the LIMSI statistical machine translation system developed for 2006 T C-STAR evaluation campaign. We describe an A*-decoder that generates translation lattices using a word-based translation model. A lattice is a rich and compact representation of alternative translations that includes the probability scores of all the involved sub-models. These lattices are then used in subsequent processing steps, in particular to perform sentence splitting and joining, maximum BLEU training and to use improved statistical target language models.

[1]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[2]  Jean-Luc Gauvain,et al.  Training Neural Network Language Models on Very Large Corpora , 2005, HLT.

[3]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[4]  Hermann Ney,et al.  Discriminative Training and Maximum Entropy Models for Statistical Machine Translation , 2002, ACL.

[5]  José B. Mariño,et al.  Bilingual N-gram Statistical Machine Translation , 2005 .

[6]  Daniel Marcu,et al.  Fast Decoding and Optimal Decoding for Machine Translation , 2001, ACL.

[7]  Giuseppe Riccardi,et al.  Computing consensus translation from multiple machine translation systems , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01..

[8]  Hermann Ney,et al.  An Efficient A* Search Algorithm for Statistical Machine Translation , 2001, DDMMT@ACL.

[9]  Hermann Ney,et al.  Cross domain automatic transcription on the TC-STAR EPPS corpus , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[10]  Holger Schwenk,et al.  Continuous Space Language Models for Statistical Machine Translation , 2006, ACL.

[11]  Hermann Ney,et al.  Phrase-Based Statistical Machine Translation , 2002, KI.

[12]  Frank Vanden Berghen,et al.  CONDOR, a new parallel, constrained extension of Powell's UOBYQA algorithm: experimental results and comparison with the DFO algorithm , 2005 .

[13]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[14]  Francisco Casacuberta,et al.  An Empirical Comparison of Stack-Based Decoding Algorithms for Statistical Machine Translation , 2003, IbPRIA.

[15]  Shankar Kumar,et al.  Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2004, NAACL.

[16]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.