Comparative Analysis for Arabic Sentiment Classification

Sentiment analysis categorizes the opinions, emotions, and reactions expressed in text as positive or negative in polarity. Mining sentiment from Arabic text is challenging, however, because few Arabic datasets are available for training models. To address this gap, this study builds an Arabic sentiment dataset sourced from tweets, product reviews, hotel reviews, movie reviews, product attraction, and restaurant reviews collected from different websites and manually labeled for training a sentiment analysis model. The dataset is then used in a comparative classification experiment with three machine learning (ML) algorithms: Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (DT). The best polarity prediction results were achieved by SVM on the product attraction dataset, with an accuracy of 0.96, precision of 0.99, recall of 0.99, and F-measure of 0.98. NB and DT followed with average performance. It can be concluded that ML classifiers need the right morphological features to improve classification accuracy when words written with the same letters play different roles in a sentence.
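The four reported measures can be illustrated with a minimal sketch. It assumes binary polarity labels and the standard definitions of accuracy, precision, recall, and F-measure (taken here as the harmonic mean of precision and recall). The label strings, the function names `count_outcomes` and `scores`, and the toy data are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of the four outcomes of a binary polarity classifier."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


def count_outcomes(gold, predicted, positive="pos"):
    """Tally outcomes, treating `positive` as the target class (hypothetical label)."""
    tp = fp = fn = tn = 0
    for g, p in zip(gold, predicted):
        if p == positive and g == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif g == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def scores(c):
    """Return accuracy, precision, recall, and F-measure for the given counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    gold = ["pos", "pos", "pos", "neg", "neg"]
    predicted = ["pos", "pos", "neg", "neg", "pos"]
    print(scores(count_outcomes(gold, predicted)))
```

Reporting all four values together matters on unbalanced data, where accuracy alone can look high even if one polarity class is rarely predicted correctly.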
