Data Inferred Multi-word Expressions for Statistical Machine Translation

This paper presents a strategy for detecting and using multi-word expressions in Statistical Machine Translation. The proposed strategy is evaluated in terms of both alignment quality and translation accuracy. Evaluations are performed on the Verbmobil corpus. Results for the English-to-Spanish and Spanish-to-English translation tasks are presented and discussed.
