Document Classification using Naïve Bayes for Indonesian Translation of the Quran

Research on classifying Indonesian-language documents has increased, but classification is still rarely applied to the needs of question answering systems. The purpose of this paper is to improve the classification of Indonesian documents, specifically the Indonesian translation of the Qur'an (ITQ), in support of a question answering system. Because development of that system is still ongoing, testing the Naïve Bayes algorithm alongside other algorithms is an important step. Naïve Bayes was chosen first for this test because it is computationally simple. The result of this study is a classification of ITQ documents into four categories: morality, faith, knowledge, and Muamalah. The average accuracy of 90.5% indicates that the Naïve Bayes method remains relevant.
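To illustrate the kind of computation involved, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing, using the four category names from the abstract. It is not the authors' implementation: the function names `train_nb` and `predict_nb`, the whitespace tokenizer, the smoothing choice, and the toy English sentences in `TOY_DOCS` are all assumptions made for illustration, and the toy sentences are invented placeholders rather than ITQ data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: invented sentences, NOT taken from the ITQ corpus.
TOY_DOCS = [
    ("be kind and patient with parents", "morality"),
    ("speak truth and avoid arrogance", "morality"),
    ("believe in god and the last day", "faith"),
    ("believe in angels and revealed books", "faith"),
    ("reflect on the creation of heavens and earth", "knowledge"),
    ("those who know and those who do not know", "knowledge"),
    ("write down debts and fulfil contracts", "muamalah"),
    ("trade fairly and give full measure", "muamalah"),
]


def train_nb(docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()  # assumed tokenizer: lowercase + whitespace
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # log prior: share of training documents in this class
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # words never seen in training are ignored
            # log likelihood with add-one smoothing over the vocabulary
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(TOY_DOCS)
print(predict_nb(model, "fulfil contracts in trade"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the per-class counts are all that needs to be stored, which is one reason the method is cheap to compute. A real experiment would additionally need Indonesian-specific preprocessing and a held-out test set to measure accuracy.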
