Arabic language sentiment analysis on health services

The social media network phenomenon creates massive amounts of valuable data that is available online and easy to access. Many users share images, videos, comments, reviews, news and opinions on different social networks sites, with Twitter being one of the most popular ones. Data collected from Twitter is highly unstructured, and extracting useful information from tweets is a challenging task. Twitter has a huge number of Arabic users who mostly post and write their tweets using the Arabic language. While there has been a lot of research on sentiment analysis in English, the amount of researches and datasets in Arabic language is limited. This paper introduces an Arabic language dataset, which is about opinions on health services and has been collected from Twitter. The paper will first detail the process of collecting the data from Twitter and also the process of filtering, pre-processing and annotating the Arabic text in order to build a big sentiment analysis dataset in Arabic. Several Machine Learning algorithms (Naïve Bayes, Support Vector Machine and Logistic Regression) alongside Deep and Convolutional Neural Networks were utilized in our experiments of sentiment analysis on our health dataset.

[1]  A. Shoukry,et al.  Sentence-level Arabic sentiment analysis , 2012, 2012 International Conference on Collaboration Technologies and Systems (CTS).

[2]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[3]  Bo Pang,et al.  Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales , 2005, ACL.

[4]  Aymen Elkhlifi,et al.  Microblogging Opinion Mining Approach for Kuwaiti Dialect , 2014 .

[5]  Lei Zhang,et al.  Sentiment Analysis and Opinion Mining , 2017, Encyclopedia of Machine Learning and Data Mining.

[6]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[7]  Patrick Paroubek,et al.  Twitter as a Corpus for Sentiment Analysis and Opinion Mining , 2010, LREC.

[8]  Harith Alani,et al.  Semantic Sentiment Analysis of Twitter , 2012, SEMWEB.

[9]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[10]  Jason Baldridge,et al.  Twitter Polarity Classification with Label Propagation over Lexical Links and the Follower Graph , 2011, ULNLP@EMNLP.

[11]  Akshi Kumar,et al.  Sentiment Analysis on Twitter , 2012 .

[12]  Rahat Iqbal,et al.  Design implications for task-specific search utilities for retrieval and re-engineering of code , 2017, Enterp. Inf. Syst..

[13]  M. Pasquier,et al.  Key issues in conducting sentiment analysis on Arabic social media text , 2013, 2013 9th International Conference on Innovations in Information Technology (IIT).

[14]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[15]  Rahat Iqbal,et al.  Automated intelligent system for sound signalling device quality assurance , 2015, Inf. Sci..

[16]  Namita Mittal,et al.  Sentiment Analysis Using Common-Sense and Context Information , 2015, Comput. Intell. Neurosci..

[17]  Mahmoud Al-Ayyoub,et al.  Arabic sentiment analysis: Lexicon-based and corpus-based , 2013, 2013 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT).

[18]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.