The Universitat d'Alacant Submissions to the English-to-Kazakh News Translation Task at WMT 2019

This paper describes the two submissions of Universitat d’Alacant to the English-to-Kazakh news translation task at WMT 2019. Our submissions take advantage of monolingual data and parallel data from other language pairs by means of iterative backtranslation, pivot backtranslation and transfer learning. They also use linguistic information in two ways: morphological segmentation of Kazakh text, and integration of the output of a rule-based machine translation system. Our systems were ranked 2nd in terms of chrF++ despite being built from an ensemble of only 2 independent training runs.
