An Approach for Arabic Text Categorization Using Association Rule Mining

Text Categorization (TC) has become one of the major techniques for organizing and managing online information. Several studies have proposed associative classification for databases, but few of them classify text documents into predefined categories based on their contents. This paper proposes a new approach to Arabic text categorization. The approach discovers association rules and uses them to build a classification model for Arabic text. An Apriori-based algorithm is employed for association rule mining. To validate the proposed approach, several experiments were conducted on a collection of Arabic documents. Three classification methods that use association rules were compared in terms of classification accuracy: ordered decision list, weighted rules, and majority voting. The results showed that the majority voting method performed best in most experiments, achieving an accuracy of up to 87%. On the other hand, the weigh...
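
The pipeline the abstract describes (Apriori-style mining of term-to-category rules, followed by ordered decision list, weighted rules, or majority voting) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the function names `mine_rules` and `classify`, the thresholds, the use of summed confidence as the rule weight, and the toy English tokens standing in for preprocessed Arabic terms are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_rules(docs, labels, min_support=0.3, min_confidence=0.6, max_len=2):
    """Level-wise (Apriori-style) search for rules of the form termset -> label.

    Hypothetical helper: thresholds and rule ranking are assumptions.
    """
    n = len(docs)
    doc_sets = [frozenset(d) for d in docs]
    # Level 1: single terms that meet the minimum support.
    counts = Counter(t for d in doc_sets for t in d)
    frequent = {frozenset([t]) for t, c in counts.items() if c / n >= min_support}
    rules, k = [], 1
    while frequent and k <= max_len:
        for itemset in frequent:
            # Labels of the documents that contain every term of the itemset.
            covered = [lab for d, lab in zip(doc_sets, labels) if itemset <= d]
            label, hits = Counter(covered).most_common(1)[0]
            conf = hits / len(covered)
            if conf >= min_confidence:
                rules.append((itemset, label, conf, hits / n))
        # Join frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k + 1}
        # Apriori pruning: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k))}
        frequent = {c for c in candidates
                    if sum(c <= d for d in doc_sets) / n >= min_support}
        k += 1
    # Rank by confidence, then support, then shorter rules; last key breaks ties.
    return sorted(rules, key=lambda r: (-r[2], -r[3], len(r[0]), sorted(r[0])))


def classify(rules, doc, method="majority"):
    """Label a document with 'ordered', 'weighted', or 'majority' rule combination."""
    terms = set(doc)
    fired = [r for r in rules if r[0] <= terms]
    if not fired:
        return None  # no rule matches; a real system would need a default class
    if method == "ordered":
        return fired[0][1]  # first matching rule in the ranked list decides
    scores = defaultdict(float)
    for _, label, conf, _ in fired:
        # Assumed weighting: sum of confidences; majority voting counts one per rule.
        scores[label] += conf if method == "weighted" else 1.0
    return max(scores, key=scores.get)


# Toy corpus (placeholder tokens, not real data from the paper).
docs = [
    ["match", "goal", "team"],
    ["team", "goal", "league"],
    ["match", "team", "coach"],
    ["bank", "market", "price"],
    ["market", "price", "oil"],
    ["bank", "price", "loan"],
]
labels = ["sport", "sport", "sport", "economy", "economy", "economy"]
rules = mine_rules(docs, labels)

for method in ("ordered", "weighted", "majority"):
    print(method, classify(rules, ["team", "goal", "referee"], method))
```

The three methods differ only in how the matching rules are combined: the ordered decision list trusts the single best-ranked rule, while weighted rules and majority voting aggregate evidence across every rule that fires on the document.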
