Are emoticons good enough to train emotion classifiers of Arabic tweets?

Automatic emotion detection is used by many applications in fields such as security informatics, e-learning, humor detection, and targeted advertising. Many of these applications focus on social media. In this study, we address the problem of emotion detection in Arabic tweets. We focus on the supervised approach, in which a classifier is trained on an already labeled dataset. Typically, such a training set is annotated manually, which is expensive and time-consuming. We propose instead to annotate the training data automatically using emojis, which are a new generation of emoticons. We show that this approach produces classifiers that are more accurate than those trained on a manually annotated dataset. To this end, we construct a dataset of emotional Arabic tweets covering four emotion classes: anger, disgust, joy, and sadness. We consider two classifiers: Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB). The test results show that, with both SVM and MNB, automatic labeling outperforms manual labeling.
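The emoji-based annotation step described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the emoji-to-emotion mapping, the rule of discarding tweets whose emojis point to more than one emotion, and the function names `label_tweet` and `build_training_set` are all assumptions made for illustration.

```python
# Hypothetical emoji-to-emotion mapping (an assumption for illustration;
# the emoji lists actually used in the study are not given in the abstract).
EMOJI_TO_EMOTION = {
    "😠": "anger", "😡": "anger",
    "🤢": "disgust", "🤮": "disgust",
    "😂": "joy", "😊": "joy",
    "😢": "sadness", "😭": "sadness",
}


def label_tweet(text):
    """Return (cleaned_text, emotion), or None if no single label applies.

    A tweet is labeled only when all of its mapped emojis agree on one
    emotion; the emojis are then removed so that a classifier trained on
    the result cannot simply memorize the labeling signal.
    """
    emotions = {EMOJI_TO_EMOTION[ch] for ch in text if ch in EMOJI_TO_EMOTION}
    if len(emotions) != 1:
        return None  # no mapped emoji, or conflicting emotions
    cleaned = "".join(ch for ch in text if ch not in EMOJI_TO_EMOTION)
    cleaned = " ".join(cleaned.split())  # collapse leftover whitespace
    return cleaned, emotions.pop()


def build_training_set(tweets):
    """Keep only the tweets that receive an unambiguous automatic label."""
    labeled = (label_tweet(t) for t in tweets)
    return [pair for pair in labeled if pair is not None]


if __name__ == "__main__":
    sample = [
        "أنا سعيد جدا اليوم 😊",   # one joy emoji
        "يوم سيء 😢😭",            # two sadness emojis, same class
        "لا أعرف 😂😡",            # conflicting emojis, discarded
        "تغريدة بدون رموز",        # no emoji, discarded
    ]
    for text, emotion in build_training_set(sample):
        print(emotion, "|", text)
```

The resulting (text, label) pairs would then be vectorized and passed to a classifier such as SVM or MNB, in the same way as a manually annotated set.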
