Submissions to the IWSLT 2016 MT Track

We present our submissions to the IWSLT 2016 machine translation task, our first attempt to translate subtitles and one of our early experiments with neural machine translation (NMT). We focus primarily on the English→Czech translation direction, but we also perform basic adaptation experiments for NMT with German and with the reverse direction. Three MT systems are tested: (1) our Chimera, a tight combination of phrase-based MT and deep linguistic processing, (2) Neural Monkey, our implementation of an NMT system in TensorFlow, and (3) Nematus, an established NMT system.
