NER in English translation of Hadith documents using classifier combination

There is a need to retrieve and extract important information in order to fully understand the ever-increasing volume of English-translated Islamic documents available on the web. Research on Named Entity Recognition (NER) for Islamic translations is limited, even though NER has received widespread attention in other languages. Translated named entities have their own characteristics, and the available annotated English corpora do not cover all transliterated Arabic names, which makes NER on translations difficult in the Islamic domain. This research addressed the use of NER in English translations of Hadith texts. Its objective was to design and develop a model able to extract named entities from English translations of Hadith texts. The research used supervised machine learning approaches, namely Support Vector Machine (SVM), Maximum Entropy (ME) and Naive Bayes (NB) classifiers, whose outputs were then combined via a majority voting algorithm to identify named entities in Hadith texts. In the results, the voting combination outperformed the single classifiers, with an overall F-measure of 95.3% in identifying named entities. The results indicated that combined models paired with suitable features were better suited to recognizing named entities in translated Hadith texts than the baseline models.
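The majority-voting step that combines the three classifiers can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the `PER`/`LOC`/`O` tag set, the sample tokens and the per-classifier predictions are all hypothetical, and the tie-breaking rule (prefer the classifier listed first) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions, tie_break_order=None):
    """Combine per-token labels from several classifiers by majority vote.

    predictions: dict mapping a classifier name to its list of labels,
                 one label per token.
    tie_break_order: classifier names in priority order. When no label has
                 a strict majority, the label proposed by the highest-priority
                 classifier among the tied labels wins (an assumed rule).
    """
    names = list(tie_break_order or predictions)
    lengths = {len(predictions[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all classifiers must label the same number of tokens")

    combined = []
    for labels in zip(*(predictions[name] for name in names)):
        counts = Counter(labels)
        top = max(counts.values())
        # `labels` is ordered by classifier priority, so the first label
        # that reaches the top count decides a tie.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined


# Hypothetical outputs of three already-trained classifiers for one sentence.
tokens = ["Narrated", "Abu", "Huraira", "in", "Medina"]
predictions = {
    "svm": ["O", "PER", "PER", "O", "LOC"],
    "me": ["O", "PER", "O", "O", "LOC"],
    "nb": ["O", "O", "PER", "O", "O"],
}

voted = majority_vote(predictions, tie_break_order=["svm", "me", "nb"])
for token, label in zip(tokens, voted):
    print(f"{token}\t{label}")
```

In this toy sentence each single classifier mislabels at most one or two tokens, but no two classifiers err on the same token, so the vote recovers a consistent labeling. That is the intuition behind combining classifiers whose errors differ.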
