A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation

Whether neural machine translation (NMT) or phrase-based statistical machine translation (PBSMT) performs better depends on the translation task. For some tasks, such as those involving low-resource language pairs or closely related languages, NMT may underperform PBSMT. To obtain a translation system that performs consistently better regardless of the task, recent work has proposed combining the PBSMT and NMT approaches. In this paper, we present an empirical comparison of the most popular existing approaches that combine PBSMT and NMT. Despite its simplicity, our reranking system, which uses a smorgasbord of informative features, significantly and consistently outperforms other methods, even for translation tasks where PBSMT and NMT produce translations of very different quality.
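To make the reranking idea concrete, the following is a minimal sketch, not the system evaluated in the paper. It assumes that the n-best lists produced by a PBSMT system and an NMT system are merged, that each hypothesis carries a dictionary of feature values, and that hypotheses are ranked by a weighted linear combination of those features. The feature names (`nmt_logprob`, `lm_logprob`, `length`), the weights, and the example sentences are hypothetical and chosen only for illustration; in practice the weights would be tuned on development data rather than set by hand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    text: str                   # candidate translation
    system: str                 # which system produced it: "pbsmt" or "nmt"
    features: Dict[str, float]  # feature name -> feature value


def score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    """Weighted linear combination of the hypothesis features."""
    return sum(weights.get(name, 0.0) * value
               for name, value in hyp.features.items())


def rerank(nbest_pbsmt: List[Hypothesis],
           nbest_nmt: List[Hypothesis],
           weights: Dict[str, float]) -> List[Hypothesis]:
    """Merge both n-best lists and sort them by descending model score."""
    merged = nbest_pbsmt + nbest_nmt
    return sorted(merged, key=lambda h: score(h, weights), reverse=True)


# Hypothetical feature values and hand-set weights (illustrative only).
weights = {"nmt_logprob": 1.0, "lm_logprob": 0.5, "length": 0.1}
nbest_pbsmt = [
    Hypothesis("the cat sat on mat", "pbsmt",
               {"nmt_logprob": -4.0, "lm_logprob": -6.0, "length": 5.0}),
]
nbest_nmt = [
    Hypothesis("the cat sat on the mat", "nmt",
               {"nmt_logprob": -2.5, "lm_logprob": -5.0, "length": 6.0}),
]

ranked = rerank(nbest_pbsmt, nbest_nmt, weights)
best = ranked[0]
print(best.system, best.text)
```

Because every hypothesis is scored with the same feature set regardless of which system generated it, the reranker can select the PBSMT output for one sentence and the NMT output for another.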
