Lexical Chain Based Cohesion Models for Document-Level Statistical Machine Translation

Lexical chains provide a representation of the lexical cohesion structure of a text. In this paper, we propose two lexical chain based cohesion models to incorporate lexical cohesion into document-level statistical machine translation: 1) a count cohesion model that rewards a hypothesis whenever a chain word occurs in it, and 2) a probability cohesion model that further takes chain word translation probabilities into account. We compute lexical chains for each source document to be translated and generate target lexical chains from the computed source chains via maximum entropy classifiers. We then use the generated target chains to constrain word selection in document-level machine translation through the two proposed cohesion models. We verify the effectiveness of the two models using a hierarchical phrase-based translation system. Experiments on large-scale training data show that both models substantially improve translation quality in terms of BLEU and that the probability cohesion model outperforms previous models based on lexical cohesion devices.
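
As a rough illustration of how the two models differ, the sketch below scores one translation hypothesis against a set of target chain words. It is a minimal sketch under stated assumptions, not the paper's formulation: the function names `count_cohesion` and `probability_cohesion`, the representation of target chain words as a word-to-probability dictionary, the use of summed log-probabilities, and the toy values are all hypothetical.

```python
import math
from typing import Dict, List


def count_cohesion(hypothesis: List[str], chain_words: Dict[str, float]) -> int:
    """Count hypothesis tokens that are target chain words.

    Assumption: each occurrence of a chain word earns one unit of reward.
    """
    return sum(1 for token in hypothesis if token in chain_words)


def probability_cohesion(hypothesis: List[str], chain_words: Dict[str, float]) -> float:
    """Sum the log translation probabilities of chain words in the hypothesis.

    Assumption: chain_words maps each target chain word to a translation
    probability in (0, 1]; tokens outside the chain contribute nothing.
    """
    return sum(math.log(chain_words[token]) for token in hypothesis if token in chain_words)


# Toy data (hypothetical): two target chain words with made-up probabilities.
toy_chain_words = {"bank": 0.5, "loan": 0.25}
toy_hypothesis = ["the", "bank", "approved", "the", "loan"]

count_score = count_cohesion(toy_hypothesis, toy_chain_words)
prob_score = probability_cohesion(toy_hypothesis, toy_chain_words)
```

In a log-linear translation system, scores of this kind would typically enter as additional features whose weights are tuned with the other model weights; how the paper integrates them into the decoder is not specified in this abstract.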
