Building a standard dataset for Arabic sentiment analysis: Identifying potential annotation pitfalls

Sentiment Analysis (SA) is a highly active research field concerned with identifying the sentiment conveyed in a piece of text. Current SA work depends on standard datasets for training and testing. Such datasets already exist for some languages, such as English, but the same cannot be said for others, such as Arabic. Existing Arabic SA datasets are restricted (in domain, size, dialects covered, etc.) and/or have limited availability. Moreover, the annotation process has not received the attention it deserves. Some existing datasets relied on the authors' own point of view for annotation, while others employed annotators but did not account for personal variation between annotators or for how that variation affects their agreement. This study presents our efforts to build a standard Arabic dataset with these concerns in mind. The constructed dataset is intended for generic use, as it contains reviews from different domains written in Modern Standard Arabic (MSA) as well as several dialects. The annotation process is given close attention: we study the inter-annotator agreement and investigate the factors that may affect it.
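As a rough illustration of what measuring inter-annotator agreement can look like, the sketch below computes Cohen's kappa for two annotators. The abstract does not say which agreement statistic the study uses, so the choice of kappa is an assumption, and the function name `cohen_kappa`, the label set, and the sample labels are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for six reviews (illustrative only, not from the dataset).
annotator_1 = ["pos", "pos", "neg", "neu", "neg", "pos"]
annotator_2 = ["pos", "neg", "neg", "neu", "neg", "pos"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa discounts the agreement that two annotators would reach by chance given how often each uses every label, which is why it is commonly preferred over raw percentage agreement when label distributions are skewed.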
