The AFRL-MITLL WMT17 Systems: Old, New, Borrowed, BLEU

This paper describes the AFRL-MITLL machine translation systems and the improvements developed during the WMT17 evaluation campaign. This year, we explore the continuing proliferation of Neural Machine Translation toolkits, revisit our previous data-selection efforts to train systems with these new toolkits, and expand our participation to the Russian–English, Turkish–English, and Chinese–English translation pairs.
