Modeling Discourse Structure for Document-level Neural Machine Translation

Recently, document-level neural machine translation (NMT) has attracted growing attention in the machine translation community. Despite this progress, most existing studies ignore the discourse structure of the input document, even though such structure has proven effective in other NLP tasks. In this paper, we propose to improve document-level NMT with the aid of discourse structure information. Our encoder builds on a hierarchical attention network (HAN) (Miculicich et al., 2018). Specifically, we first parse the input document to obtain its discourse structure. Then, we introduce a Transformer-based path encoder to embed the discourse structure information of each word. Finally, we combine this discourse representation with the word embedding before it is fed into the encoder. Experimental results on an English-to-German dataset show that our model significantly outperforms both the Transformer and Transformer+HAN baselines.
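The pipeline above (parse, encode each word's discourse path, fuse with the word embedding) can be sketched as follows. This is a minimal illustrative implementation in PyTorch, not the authors' code: the module name, the pooling choice (mean over the path), and all hyperparameters are assumptions; the discourse labels would come from an external RST parser.

```python
import torch
import torch.nn as nn

class DiscoursePathEncoder(nn.Module):
    """Hypothetical sketch of a Transformer-based path encoder.

    For each word, the sequence of discourse labels on the path from the
    RST tree root to the word's elementary discourse unit is embedded and
    encoded with a small Transformer, then pooled into one vector that is
    added to the word embedding before the NMT encoder.
    """

    def __init__(self, num_path_labels, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.path_embed = nn.Embedding(num_path_labels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, path_ids, word_embeddings):
        # path_ids: (num_words, path_len) ids of discourse labels per word
        # word_embeddings: (num_words, d_model)
        h = self.encoder(self.path_embed(path_ids))  # (num_words, path_len, d_model)
        path_vec = h.mean(dim=1)                     # pool over the path
        return word_embeddings + path_vec            # fuse before the NMT encoder
```

Under this sketch, the fused embeddings simply replace the plain word embeddings at the input of the sentence-level encoder, so the rest of the Transformer+HAN architecture is unchanged.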

[1] Andy Way et al. Exploiting Cross-Sentence Context for Neural Machine Translation, 2017, EMNLP.

[2] Masaaki Nagata et al. Dependency-based Discourse Parser for Single-Document Summarization, 2014, EMNLP.

[3] Gholamreza Haffari et al. Document Context Neural Machine Translation with Memory Networks, 2017, ACL.

[4] Yang Liu et al. A Hierarchy-to-Sequence Attentional Neural Machine Translation Model, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[5] Lukasz Kaiser et al. Attention Is All You Need, 2017, NIPS.

[6] Matthew G. Snover et al. A Study of Translation Edit Rate with Targeted Human Annotation, 2006, AMTA.

[7] Kim Schouten et al. COMMIT at SemEval-2016 Task 5: Sentiment Analysis with Rhetorical Structure Theory, 2016, SemEval@NAACL-HLT.

[8] Qun Liu et al. Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information, 2012, ACL.

[9] Min Zhang et al. Variational Neural Machine Translation, 2016, EMNLP.

[10] Guodong Zhou et al. Stance Detection with Hierarchical Attention Network, 2018, COLING.

[11] Giuseppe Carenini et al. Exploring Joint Neural Model for Sentence Level Discourse Parsing and Sentiment Analysis, 2017, SIGDIAL.

[12] Houfeng Wang et al. A Two-Stage Parsing Method for Text-Level Discourse Analysis, 2017, ACL.

[13] Xinyan Xiao et al. A Topic Similarity Model for Hierarchical Phrase-based Translation, 2012, ACL.

[14] Yang Feng et al. Enhancing Context Modeling with a Query-Guided Capsule Network for Document-level Translation, 2019, EMNLP/IJCNLP.

[15] Yizhong Wang et al. Toward Fast and Accurate Neural Discourse Segmentation, 2018, EMNLP.

[16] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[17] William C. Mann et al. Rhetorical Structure Theory: Toward a functional theory of text organization, 1988.

[18] Jörg Tiedemann et al. Neural Machine Translation with Extended Context, 2017, DiscoMT@EMNLP.

[19] Karin Sim Smith. On Integrating Discourse in Machine Translation, 2017, DiscoMT@EMNLP.

[20] James Henderson et al. Document-Level Neural Machine Translation with Hierarchical Attention Networks, 2018, EMNLP.

[21] Ralph Weischedel et al. A Study of Translation Error Rate with Targeted Human Annotation, 2005.

[22] Philipp Koehn et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.

[23] Gholamreza Haffari et al. Selective Attention for Context-aware Neural Machine Translation, 2019, NAACL.

[24] Rico Sennrich et al. Context-Aware Neural Machine Translation Learns Anaphora Resolution, 2018, ACL.

[25] Yang Liu et al. Context Gates for Neural Machine Translation, 2016, TACL.

[26] Diyi Yang et al. Hierarchical Attention Networks for Document Classification, 2016, NAACL.

[27] Jingbo Zhu et al. Document-level Consistency Verification in Machine Translation, 2011, MT Summit.

[28] Yang Liu et al. A Context-Aware Topic Model for Statistical Machine Translation, 2015, ACL.

[29] Guodong Zhou et al. Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches, 2017, COLING.

[30] Salim Roukos et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[31] Shan Wu et al. Variational Recurrent Neural Machine Translation, 2018, AAAI.

[32] Huanbo Luan et al. Improving the Transformer Translation Model with Document-Level Context, 2018, EMNLP.

[33] Ichiro Sakata et al. Unsupervised Neural Single-Document Summarization of Reviews via Learning Latent Discourse Structure and its Ranking, 2019, ACL.