Annotating Targets of Opinions in Arabic using Crowdsourcing

We present a method for annotating targets of opinions in Arabic in a two-stage process using the crowdsourcing tool Amazon Mechanical Turk. The first stage consists of identifying candidate targets, or “entities”, in a given text. The second stage consists of identifying the opinion polarity (positive, negative, or neutral) expressed about a specific entity. We annotate a corpus of Arabic text using this method, selecting our data from online commentaries in different domains. Despite the complexity of the task, we find high agreement among annotators. We present a detailed analysis.
