Convergence of Translation Memory and Statistical Machine Translation

We present two methods that combine ideas from statistical machine translation (SMT) and translation memories (TM). A TM retrieves a fuzzy match for each source segment, and the mismatched parts of that match are replaced with instructions to an SMT system to fill in the gaps. For fuzzy matches with a score above 70%, one of the two methods outperforms both the SMT and the TM baselines.
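
The retrieval step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word-level edit-distance definition of the fuzzy match score, the function names, and the toy TM entries are all assumptions made for illustration, and the abstract does not specify how the SMT system is instructed to fill each gap. The sketch only finds the best TM match for an input segment, scores it, and reports which input token spans differ from the match and would therefore be left to SMT.

```python
from difflib import SequenceMatcher


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete a word from a
                           cur[j - 1] + 1,             # insert a word from b
                           prev[j - 1] + (wa != wb)))  # substitute or keep
        prev = cur
    return prev[-1]


def fuzzy_score(a, b):
    """Assumed definition: 1 - edit distance / length of the longer segment."""
    return 1.0 - edit_distance(a, b) / max(len(a), len(b), 1)


def best_match(source, tm):
    """Return the (tm_source, tm_target) pair with the highest fuzzy score."""
    src = source.split()
    return max(tm, key=lambda pair: fuzzy_score(src, pair[0].split()))


def mismatched_spans(source, tm_source):
    """Token spans (start, end) of the input that differ from the TM source.

    An empty span (start == end) marks a position where the TM source has
    words that the input lacks.
    """
    src, tm_src = source.split(), tm_source.split()
    ops = SequenceMatcher(a=tm_src, b=src, autojunk=False).get_opcodes()
    return [(j1, j2) for tag, _i1, _i2, j1, j2 in ops if tag != "equal"]


# Hypothetical toy translation memory (source, target) pairs.
tm = [
    ("the second paragraph of article 21 is deleted",
     "le deuxième alinéa de l' article 21 est supprimé"),
    ("the committee shall adopt its rules of procedure",
     "le comité établit son règlement intérieur"),
]
segment = "the second paragraph of article 5 is deleted"

tm_source, tm_target = best_match(segment, tm)
score = fuzzy_score(segment.split(), tm_source.split())
gaps = mismatched_spans(segment, tm_source)
```

In this toy case one word out of eight differs, so the match would clear a 70% threshold and only the span covering "5" would be handed to the SMT system. Mapping that span onto the TM target requires word alignment, which the sketch leaves out.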
