Many researchers in information technology have designed and developed algorithms to solve stemming problems, especially for Arabic. Despite the many stemming studies for Arabic, there is still no standard stemming algorithm for accurately analysing the text of the Qur'an. Developing a stemmer for the Qur'an is important because it supports the Sharaf classification used to understand the meaning of every word in the Qur'an. One stemming algorithm for finding the base form of an Arabic word is the Khoja Stemmer. It tries to find the root of an Arabic word by removing the longest prefix and the longest suffix, then matching the remainder against a dictionary of root words. In this study, the Khoja Stemmer implementation achieved an average stemming rate of 95.295% on the text of the Qur'an. However, manual checking shows that the roots it produces still contain several errors. Thus, a Qur'an dictionary is needed to verify each stemming result produced by the Khoja Stemmer.
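The affix-stripping idea described above can be illustrated with a minimal sketch. It is not the Khoja Stemmer itself: the prefix list, suffix list, and root dictionary below are tiny hypothetical samples chosen for illustration, the function names are assumptions, and the pattern-matching steps of the real algorithm are omitted.

```python
# Simplified sketch of longest-affix stripping followed by a root-dictionary lookup.
# PREFIXES, SUFFIXES and ROOTS are hypothetical toy samples, not the resources
# shipped with the actual Khoja Stemmer.
PREFIXES = ["وال", "بال", "ال", "و"]
SUFFIXES = ["ون", "ات", "ين", "ها", "ة"]
ROOTS = {"كتب", "درس", "علم"}

MIN_STEM_LEN = 3  # assumption: never strip below three letters


def strip_longest_prefix(word):
    """Remove the longest matching prefix, if enough letters remain."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            return word[len(prefix):]
    return word


def strip_longest_suffix(word):
    """Remove the longest matching suffix, if enough letters remain."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[:-len(suffix)]
    return word


def stem(word):
    """Return the root if the stripped word is in the dictionary, else None."""
    remainder = strip_longest_suffix(strip_longest_prefix(word))
    return remainder if remainder in ROOTS else None


if __name__ == "__main__":
    for w in ["الكتب", "والدرس", "العلمين", "قلم"]:
        print(w, "->", stem(w))
```

Returning `None` for a word whose remainder is not in the dictionary makes unresolved cases explicit, which is where a manual check against a Qur'an dictionary would come in.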