Arabic Text Categorization Using Support Vector Machine, Naïve Bayes and Neural Network

Text classification is an important area of information retrieval. Text classification techniques are used to assign documents to a set of predefined categories. Several techniques exist for classifying data, and many studies address English text classification; unfortunately, few address Arabic text classification. This paper examines three well-known classification techniques: Support Vector Machine, Naïve Bayes, and Neural Network. The three techniques are applied to an Arabic data set and compared. The study uses a fixed number of documents for every category in both the training and testing phases. The results show that the Support Vector Machine gives the best results.
