A Poor Man’s Translation Memory Using Machine Translation Evaluation Metrics

Given a new sentence to translate, q (the query), the core TM functionality consists of finding the best match for q in D, i.e. the pair 〈ŝ, t〉 with maximum similarity x = f (q, ŝ); if x ≥ α, the system outputs the target-language counterpart t of ŝ, and otherwise nothing. The similarity function f measures the similarity between two source-language strings. Typically it produces a value between 0 and 1, where 0 means “completely different” and 1 means “identical”; α can then lie in the range [0, 1]. It is generally acknowledged that commercial TM systems use variants of the Levenshtein distance.
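The lookup procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any commercial system: it assumes D is a list of 〈source, target〉 pairs and uses a normalized Levenshtein similarity, 1 − dist(q, ŝ)/max(|q|, |ŝ|), as the function f; the names `tm_lookup` and `similarity` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def similarity(q: str, s: str) -> float:
    """Normalized Levenshtein similarity in [0, 1]; 1 means identical."""
    if not q and not s:
        return 1.0
    return 1.0 - levenshtein(q, s) / max(len(q), len(s))

def tm_lookup(q, tm, alpha=0.7):
    """Return the target side t of the best match <s_hat, t> in tm
    if f(q, s_hat) >= alpha, otherwise None (the system outputs nothing)."""
    s_hat, t = max(tm, key=lambda pair: similarity(q, pair[0]))
    return t if similarity(q, s_hat) >= alpha else None
```

For example, with `tm = [("the cat sat", "le chat s'est assis")]`, an exact query returns the stored translation, while an unrelated query falls below the threshold and returns `None`.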
