NTT Neural Machine Translation Systems at WAT 2019

In this paper, we describe our systems submitted to the translation shared tasks at WAT 2019. This year, we participated in two distinct types of subtasks, a scientific paper subtask and a timely disclosure subtask, where we only considered the English-to-Japanese and Japanese-to-English translation directions. We submitted two systems (En-Ja and Ja-En) for the scientific paper subtask and two systems (Ja-En, one each for texts and items) for the timely disclosure subtask. Three of our four systems obtained the best human evaluation scores. We also confirmed that our additional web-crawled parallel corpus improves translation performance in the unconstrained settings.
