Deep learning approaches for Arabic sentiment analysis

Social media are an excellent source of information, offering opinions, thoughts and insights on a wide range of important topics. Sentiment analysis has become a hot research topic because of its importance for making decisions based on opinions derived from users' content on social media. Although Arabic is one of the most widely spoken languages used for sharing content on social media, sentiment analysis of Arabic content remains limited by several challenges, including the morphological structure of the language, the variety of dialects and the lack of appropriate corpora. Hence, research in Arabic sentiment analysis has grown slowly in contrast to other languages such as English. The contribution of this paper is twofold. First, we introduce a corpus of forty thousand labeled Arabic tweets spanning several topics. Second, we present three deep learning models, namely CNN, LSTM and RCNN, for Arabic sentiment analysis. Using word embeddings, we validate the performance of the three models on the proposed corpus. The experimental results indicate that LSTM, with an average accuracy of 81.31%, outperforms CNN and RCNN. In addition, applying data augmentation to the corpus increases LSTM accuracy by 8.3%.
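The abstract does not say which data augmentation technique was applied to the corpus. As a rough illustration of the general idea only, the sketch below enlarges a labeled corpus by adding copies of each tweet with two random word positions swapped. The random-swap strategy, the function names `swap_augment` and `augment_corpus`, and the two toy tweets are assumptions made for this example and are not taken from the paper.

```python
import random


def swap_augment(tokens, rng):
    """Return a copy of tokens with two randomly chosen positions swapped."""
    out = list(tokens)
    if len(out) < 2:
        return out  # nothing to swap in a one-word text
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out


def augment_corpus(corpus, copies=1, seed=0):
    """Append `copies` perturbed variants of every (text, label) pair.

    Hypothetical helper: random word swapping is one simple augmentation
    choice, not necessarily the one used by the authors.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    augmented = list(corpus)   # keep the original examples first
    for text, label in corpus:
        tokens = text.split()
        for _ in range(copies):
            new_text = " ".join(swap_augment(tokens, rng))
            augmented.append((new_text, label))  # label is unchanged
    return augmented


# Toy examples invented for illustration (not from the paper's corpus).
corpus = [
    ("الخدمة ممتازة جدا", "positive"),
    ("التطبيق بطيء ومزعج", "negative"),
]
bigger = augment_corpus(corpus, copies=2)
print(len(corpus), "->", len(bigger))
```

Each augmented tweet keeps its original label and vocabulary, so the classifier sees more word-order variation without any new annotation effort.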
