The JHU Machine Translation Systems for WMT 2016

This paper describes the submission of Johns Hopkins University for the shared translation task of the ACL 2016 First Conference on Machine Translation (WMT 2016). We set up phrase-based, hierarchical phrase-based, and syntax-based systems for all 12 language pairs of this year’s evaluation campaign. Novel research directions we investigated include neural probabilistic language models, bilingual neural network language models, morphological segmentation, and the attention-based neural machine translation model as a reranking feature.
