Computational and natural language processing based studies of hadith literature: a survey

Hadith is one of the most celebrated resources of Classical Arabic text. The hadiths, or Prophetic traditions (traditions for short), are narrations originating from the sayings and conduct of Prophet Muhammad. For Muslims, hadiths are the second most important source of Islamic jurisprudence after the Holy Qur’an. Each hadith consists of two parts, the isnad and the matn. The matn is the actual text of the hadith, while the isnad is the chain of authorities that precedes and introduces the matn: the succession of people through whom the hadith reached the last transmitter. The hadith corpus is huge, running into hundreds of volumes, and it is accompanied by an even larger body of supporting work, e.g., commentaries and biographical material. Recently, there has been renewed interest in this important subject among non-specialists. Many published studies apply computational and natural language processing (NLP) techniques to hadith, either to help address outstanding issues or to derive new insight into this classic resource. This paper surveys all major works that have addressed the subject of hadith through various computational and NLP methods, grouping them into three categories: hadith content-based studies, narration-based studies, and overall studies. We also take an in-depth look at pioneering works, with many details appearing for the first time. Finally, we outline future research directions in Arabic hadith literature, including novel applications of emerging concept-based sentiment and emotion mining techniques for natural language.
