A Rule-Based Approach For Pronoun Extraction And Pronoun Mapping In Pronominal Anaphora Resolution Of Quran English Translations

Because the Quran and its translations are classical Arabic and English texts, ordinary natural language processing tools, such as pronominal anaphora resolution systems, perform less accurately on them. Pronominal anaphora resolution is the task of finding an antecedent for each anaphoric pronoun, that is, each pronoun that acts as a referring expression in the discourse. The performance of a pronominal anaphora resolution system depends heavily on the pre-processing tools that analyze and prepare the input data for the resolution algorithm. This paper proposes a novel pre-processing approach for pronoun extraction and pronoun mapping in a pronominal anaphora resolution system for English translations of the Quran. The approach facilitates anaphora resolution, specifically for English pronouns without an explicit antecedent, which account for close to 50% of the anaphoric relations in the Quran. It uses morphological, statistical, and anaphoric knowledge extracted from the Arabic corpus of the Quran. To evaluate the approach, 1% of an English translation was annotated, labeling all anaphoric and non-anaphoric English pronouns. These pronouns were aligned to the equivalent Arabic pronouns and linked to the concepts in the Arabic text. The statistical results show that our rule-based pre-processing tools perform well. The precision, recall, and accuracy of the pronoun extraction stage are 96.38%, 100%, and 99.5%, respectively. The results of the mapping algorithm are promising: it scores 85.51% in precision, 96.32% in recall, and 82.81% in accuracy.
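The abstract reports precision, recall, and accuracy but does not state how they are computed. The following is a minimal sketch that assumes the standard confusion-matrix definitions. The `Counts` class, the function names, and the example counts are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one evaluation stage (hypothetical container)."""
    tp: int  # items correctly labeled positive (e.g. true pronouns extracted)
    fp: int  # items wrongly labeled positive
    fn: int  # positive items that were missed
    tn: int  # items correctly labeled negative


def precision(c: Counts) -> float:
    """Fraction of predicted positives that are correct."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Counts) -> float:
    """Fraction of actual positives that were found."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def accuracy(c: Counts) -> float:
    """Fraction of all decisions that are correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Toy counts chosen only to exercise the functions; not data from the paper.
toy = Counts(tp=8, fp=2, fn=0, tn=10)
scores = (precision(toy), recall(toy), accuracy(toy))
```

Under these definitions, a stage that misses no positive items (`fn == 0`) has a recall of 100% even when its precision is lower, which is the pattern reported for the extraction stage.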
