Pronominal Anaphora in Machine Translation

State-of-the-art machine translation systems rely on strong independence assumptions. Under these assumptions, text is split into small segments such as sentences and phrases, which are translated independently. Natural language, however, is not independent: many concepts depend on context. One such case is the reference introduced by pronominal anaphora, in which a pronoun (the anaphor) refers to a concept mentioned earlier in the text (the antecedent). The antecedent may occur in the same sentence, but the reference can also span many sentences. Pronominal anaphora pose a challenge for translators because the anaphor has to agree grammatically with the antecedent. The reference therefore has to be detected in the source text before translation, and the translator needs to ensure that it still holds in the translation. The independence assumptions of current machine translation systems do not allow for this.

We study pronominal anaphora in two English–German machine translation tasks. We analyse how often pronominal anaphora occur and how well they are currently translated. This analysis shows that the implicit handling of pronominal anaphora in our baseline translation system is not sufficient. We therefore develop four approaches that handle pronominal anaphora explicitly. Two of them are based on post-processing: the first corrects pronouns directly, and the second selects a hypothesis with correct pronouns from the translation system’s n-best list. Both improve the translation accuracy of the pronouns but hardly change the translation quality measured in BLEU. The other two approaches predict translations of pronouns and can be used in the decoder: the Discriminative Word Lexicon (DWL) predicts the probability that a target word is used in the translation, and the Source DWL (SDWL) directly predicts the translation of a source-language pronoun. However, these predictions do not improve the quality already achieved by the translation system.
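
The n-best post-processing idea can be illustrated with a minimal sketch. It is not the system described here: the pronoun table, the whitespace tokenisation, and the function name `select_hypothesis` are all assumptions made for illustration, and the gender of the antecedent's German translation is assumed to be already known from a separate anaphora resolution step.

```python
from typing import Dict, List, Set

# Hypothetical table: grammatical gender of the German antecedent mapped to
# the nominative third-person singular pronouns that agree with it.
PRONOUNS_BY_GENDER: Dict[str, Set[str]] = {
    "masc": {"er"},
    "fem": {"sie"},
    "neut": {"es"},
}
ALL_PRONOUNS: Set[str] = set().union(*PRONOUNS_BY_GENDER.values())


def select_hypothesis(nbest: List[str], antecedent_gender: str) -> str:
    """Pick the highest-ranked hypothesis whose pronouns agree with the antecedent.

    `nbest` is assumed to be sorted from best to worst model score.
    If no hypothesis contains only agreeing pronouns, the 1-best is kept.
    """
    allowed = PRONOUNS_BY_GENDER[antecedent_gender]
    for hypothesis in nbest:
        # Naive tokenisation: lower-case and split on whitespace.
        tokens = hypothesis.lower().split()
        pronouns = [tok for tok in tokens if tok in ALL_PRONOUNS]
        if pronouns and all(p in allowed for p in pronouns):
            return hypothesis
    return nbest[0]  # fall back to the decoder's first choice


if __name__ == "__main__":
    nbest_list = ["Es ist groß .", "Er ist groß .", "Sie ist groß ."]
    print(select_hypothesis(nbest_list, "masc"))
```

Falling back to the 1-best hypothesis when nothing agrees keeps the sketch from ever producing output worse than the baseline ranking. A real system would also have to handle case, number, and the ambiguity of forms such as "sie".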
