Classification of Arabic Text Using Singular Value Decomposition and Fuzzy C-Means Algorithms

This paper proposes using singular value decomposition (SVD) and fuzzy c-means (FCM) algorithms for Arabic text classification. The Al Jazeera Arabic news and CNN Arabic news datasets are used to measure the effectiveness of the proposed approach. The experimental results are compared with four supervised classification methods that previous work applied to the same datasets: support vector machine, naive Bayes, decision tree, and polynomial networks. The results indicate that the proposed approach is effective compared with recent work in Arabic text classification.
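As a rough illustration of how the two techniques fit together, the following minimal sketch reduces a toy term-document matrix with a truncated SVD and then clusters the reduced document vectors with a textbook fuzzy c-means loop. It is not the paper's implementation: the function names, the toy count matrix, the rank k = 2, the fuzzifier m = 2, and the random initialization are all assumptions made for this example. The abstract does not describe preprocessing, term weighting, or how clusters are mapped to class labels, so none of that is shown.

```python
import numpy as np

def reduce_with_svd(term_doc, k):
    """Project documents (columns of term_doc) onto the top-k singular directions."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Row i of the result is document i in the k-dimensional latent space.
    return (np.diag(s[:k]) @ vt[:k]).T

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means; returns (memberships, centers)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Centers are membership-weighted means of the points.
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        # u_ij is proportional to d_ij^(-2/(m-1)), normalized over clusters j.
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return u, centers

# Hypothetical toy data: 6 terms (rows) x 6 documents (columns), two rough topics.
term_doc = np.array([
    [2, 3, 1, 0, 0, 0],
    [1, 2, 2, 0, 0, 0],
    [3, 1, 2, 0, 0, 1],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 1, 1, 2, 3],
    [0, 0, 0, 3, 1, 2],
], dtype=float)

docs = reduce_with_svd(term_doc, k=2)
memberships, centers = fuzzy_c_means(docs, n_clusters=2)
labels = memberships.argmax(axis=1)  # hard label = cluster with highest membership
print(np.round(memberships, 3))
print(labels)
```

Each row of `memberships` gives one document's degree of belonging to every cluster, which is what distinguishes fuzzy c-means from hard clustering; taking the arg-max is only one possible way to turn those degrees into a single label.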

[1]  Fouzi Harrag,et al.  Improving Arabic Text Categorization Using Neural Network with SVD , 2010, J. Digit. Inf. Manag..

[2]  Souad Larabi Marie-Sainte,et al.  Firefly Algorithm based Feature Selection for Arabic Text Classification , 2020, J. King Saud Univ. Comput. Inf. Sci..

[3]  Ibrahim Abu El-Khair,et al.  Effects of Stop Words Elimination for Arabic Information Retrieval: A Comparative Study , 2017, ArXiv.

[4]  B. S. Harish,et al.  Classification of Text Documents Using Adaptive Fuzzy C-Means Clustering , 2013, ISI.

[5]  Fawaz S. Al-Anzi,et al.  Employing fisher discriminant analysis for Arabic text classification , 2017, Comput. Electr. Eng..

[6]  David W. Corne,et al.  Feature subset selection for Arabic document categorization using BPSO-KNN , 2011, 2011 Third World Congress on Nature and Biologically Inspired Computing.

[7]  Fawaz S. Al-Anzi,et al.  Toward an enhanced Arabic text classification using cosine similarity and Latent Semantic Indexing , 2017, J. King Saud Univ. Comput. Inf. Sci..

[8]  Gerard Salton,et al.  A vector space model for automatic indexing , 1975, CACM.

[9]  Thaung Thaung Win,et al.  Document clustering by fuzzy c-mean algorithm , 2010, 2010 2nd International Conference on Advanced Computer Control.

[10]  Luciano Fadiga,et al.  Automatic online spike sorting with singular value decomposition and fuzzy C-mean clustering , 2012, BMC Neuroscience.

[11]  Taha Osman,et al.  Challenges in Sentiment Analysis for Arabic Social Networks , 2017, ACLING.

[12]  Mohammed Al-Sarem,et al.  Feature selection using an improved Chi-square for Arabic text classification , 2020, J. King Saud Univ. Comput. Inf. Sci..

[13]  Abdelwadood Moh'd. Mesleh,et al.  Feature sub-set selection metrics for Arabic text classification , 2011, Pattern Recognit. Lett..

[14]  M. Roubens Pattern classification problems and fuzzy sets , 1978 .

[15]  Bustami Yusuf,et al.  Singular Value Decomposition for dimensionality reduction in unsupervised text learning problems , 2010, 2010 2nd International Conference on Education Technology and Computer.

[16]  Juebo Wu,et al.  An Improved Fuzzy Clustering Method for Text Mining , 2010, 2010 Second International Conference on Networks Security, Wireless Communications and Trusted Computing.

[17]  Muhamad Taufik Abdullah,et al.  MALAY DOCUMENTS CLUSTERING ALGORITHM BASED ON SINGULAR VALUE DECOMPOSITION , 2009 .

[18]  Vivek Kumar Singh,et al.  Document Clustering Using K-Means, Heuristic K-Means and Fuzzy C-Means , 2011, 2011 International Conference on Computational Intelligence and Communication Networks.

[19]  Cheng Hua Li,et al.  Neural Network for Text Classification Based on Singular Value Decomposition , 2007, 7th IEEE International Conference on Computer and Information Technology (CIT 2007).

[20]  Simone A. Ludwig MapReduce-based fuzzy c-means clustering algorithm: implementation and scalability , 2015, Int. J. Mach. Learn. Cybern..

[21]  Matsumoto Yuji,et al.  Document Clustering : Before and After the Singular Value Decomposition , 1999 .

[22]  Gene H. Golub,et al.  Singular value decomposition and least squares solutions , 1970, Milestones in Matrix Computation.

[23]  Mayy M. Al-Tahrawi,et al.  Arabic text classification using Polynomial Networks , 2015, J. King Saud Univ. Comput. Inf. Sci..

[24]  Bassam Al-Salemi,et al.  Multi-label Arabic text categorization: A benchmark and baseline comparison of multi-label learning algorithms , 2019, Inf. Process. Manag..

[25]  Fawaz S. Al-Anzi,et al.  Beyond vector space model for hierarchical Arabic text classification: A Markov chain approach , 2018, Inf. Process. Manag..

[26]  Fatma Elghannam,et al.  Text representation and classification based on bi-gram alphabet , 2019, J. King Saud Univ. Comput. Inf. Sci..