Machine Translation and Monolingual Postediting: The AFRL WMT-14 System

This paper describes the AFRL statistical MT system and the improvements developed during the WMT14 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Russian-to-English and Hindi-to-English translation tasks. In addition, we describe our efforts to use monolingual English speakers to correct machine translation output, and we present the results of monolingual postediting of all 3003 sentences of the WMT14 Russian-English test set.
