The RWTH Machine Translation System for WMT 2009

RWTH participated in the shared translation task of the Fourth Workshop on Statistical Machine Translation (WMT 2009) with the German-English, French-English and Spanish-English language pairs, in both translation directions. The submissions were generated using a phrase-based and a hierarchical statistical machine translation system with appropriate morpho-syntactic enhancements: POS-based reordering of the source language for the phrase-based systems, and splitting of German compounds for both systems. For some tasks, system combination was used to generate the final hypothesis. An additional English hypothesis was produced by combining all three final systems for translation into English.
