The University of Helsinki submissions to the WMT18 news task

This paper describes the University of Helsinki’s submissions to the WMT18 shared news translation task for English-Finnish and English-Estonian, in both directions. This year, our main submissions employ a novel neural architecture, the Transformer, using the open-source OpenNMT framework. Our experiments couple domain labeling and fine-tuned multilingual models with vocabularies shared between the source and target languages, using the parallel data provided for the shared task and additional back-translations. Finally, for the English-to-Finnish case, we compare the effectiveness of different machine translation architectures, ranging from a rule-based approach to our best neural model, analyze the output, and highlight directions for future research.
