Arabic Tweets Sentimental Analysis Using Machine Learning

The continuing rapid growth of electronic Arabic content on social media, and on Twitter in particular, presents an opportunity for opinion mining research. That research is nevertheless hindered by either the lack of sentiment analysis resources or the challenges of analyzing Arabic text. This study introduces an Arabic Jordanian Twitter corpus in which tweets are annotated as either positive or negative. It investigates different supervised machine learning approaches to sentiment analysis when applied to Arabic users' social media posts on general subjects, written in either Modern Standard Arabic (MSA) or the Jordanian dialect. Experiments are conducted to evaluate different weighting schemes, stemming, and N-gram term techniques across several scenarios. The experimental results identify the best scenario for each classifier and indicate that the SVM classifier using the term frequency–inverse document frequency (TF-IDF) weighting scheme with stemming and bigram features outperforms the best scenario of the Naive Bayesian classifier. Furthermore, the results of this study outperform those reported in comparable related work.
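To make the feature side of these experiments concrete, the following is a minimal sketch of TF-IDF weighting over unigram and bigram features. It is not the study's implementation: the function names, the whitespace tokenizer, the particular TF-IDF variant (relative term frequency times log(N/df)), and the three toy tweets are all assumptions made for illustration, and stemming and the classifiers themselves (SVM, Naive Bayes) are left out.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams in tokens, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=2):
    """Whitespace-tokenize text and return all 1-grams up to max_n-grams.

    A real pipeline would normalize and stem the tokens first; that step
    is omitted here (assumption: plain whitespace tokenization).
    """
    tokens = text.split()
    features = []
    for n in range(1, max_n + 1):
        features.extend(ngrams(tokens, n))
    return features


def tfidf_vectors(docs, max_n=2):
    """Return one {feature: weight} dict per document.

    Weight = (count / total features in doc) * log(N / document frequency).
    This is one common TF-IDF variant, chosen here for illustration.
    """
    doc_counts = [Counter(extract_features(d, max_n)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in doc_counts:
        doc_freq.update(counts.keys())  # each feature counted once per doc
    vectors = []
    for counts in doc_counts:
        total = sum(counts.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy tweets, invented for this example (not from the corpus).
docs = [
    "الخدمة ممتازة جدا",  # "the service is very excellent"
    "الخدمة سيئة جدا",    # "the service is very bad"
    "الفيلم ممتع",         # "the film is enjoyable"
]
vectors = tfidf_vectors(docs)
```

In this toy collection, terms that occur in only one tweet (such as "ممتازة" or the bigram "الخدمة ممتازة") receive a higher weight than terms shared by two tweets (such as "الخدمة"), which is the effect a TF-IDF scheme is meant to produce before the vectors are passed to a classifier.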
