Input Combination Strategies for Multi-Source Transformer Decoder

In multi-source sequence-to-sequence tasks, attention over the inputs can be modeled in several ways. This question has been studied thoroughly for recurrent architectures. In this paper, we extend that work to the encoder-decoder attention of the Transformer architecture. We propose four input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate them on multimodal translation and on translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single-source baselines.
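The abstract names the four strategies without defining them. The sketch below shows one plausible reading of each, under assumptions that are not taken from the text above: single-head scaled dot-product attention with no learned projections, encoder states used as both keys and values, a residual connection around each attention step, and a second attention over per-source contexts for the hierarchical case. All function names are hypothetical, and the code illustrates the idea rather than the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, states):
    # Scaled dot-product attention; `states` act as keys and values (assumption).
    scores = queries @ states.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ states

def serial(queries, sources):
    # Attend to each source in turn; each step sees the previous step's output.
    out = queries
    for states in sources:
        out = out + attend(out, states)
    return out

def parallel(queries, sources):
    # Attend to every source with the same queries and sum the contexts.
    return queries + sum(attend(queries, s) for s in sources)

def flat(queries, sources):
    # Concatenate all source states and attend over the joint sequence.
    return queries + attend(queries, np.concatenate(sources, axis=0))

def hierarchical(queries, sources):
    # Attend to each source separately, then attend over the resulting contexts.
    contexts = np.stack([attend(queries, s) for s in sources], axis=1)  # (T, n, d)
    scores = np.einsum("td,tnd->tn", queries, contexts) / np.sqrt(queries.shape[-1])
    weights = softmax(scores, axis=-1)
    return queries + np.einsum("tn,tnd->td", weights, contexts)

rng = np.random.default_rng(0)
decoder_states = rng.normal(size=(3, 8))   # 3 target positions, model size 8
text_states = rng.normal(size=(5, 8))      # e.g. a source sentence
image_states = rng.normal(size=(4, 8))     # e.g. image region features
for combine in (serial, parallel, flat, hierarchical):
    print(combine.__name__, combine(decoder_states, [text_states, image_states]).shape)
```

With a single source, all four functions in this sketch reduce to the same ordinary encoder-decoder attention, which makes a convenient sanity check. They differ only in how the contexts from several sources are merged.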
