Paper information - Modeling hypotactic structure for Chinese-English neural machine translation of complex sentences

The hypotactic structural relation between clauses plays an important role in improving the discourse coherence of document-level translation. However, standard neural machine translation (NMT) models do not explicitly model the hypotactic relationship between clauses, which usually leads to structurally incorrect translations of long and complex sentences. The problem is particularly noticeable in Chinese-to-English translation of complex sentences because the two languages differ in grammatical form: English is rich in grammatical form (e.g., verb morphological changes and subordinating conjunctions), while Chinese is poor in it. These differences make it challenging for NMT to learn hypotactic structure knowledge from Chinese, as well as the structure alignment between Chinese and English. To address these issues, we propose to model the hypotactic structure for Chinese-to-English complex sentence translation by introducing hypotactic structure knowledge. Specifically, we annotate and build a hypotactic-structure-aligned parallel corpus that provides rich hypotactic structure knowledge for NMT. We further propose a structure-infused neural framework that combines the hypotactic structure knowledge with the NMT model through two integration strategies. In particular, we introduce a structure-aware loss to encourage the NMT model to better learn the structure knowledge. Experimental results on the WMT17, WMT18 and WMT19 Chinese-to-English translation tasks demonstrate the effectiveness of the proposed methods.
