A Large Scale Arabic Sentiment Lexicon for Arabic Opinion Mining

Most opinion mining methods in English rely successfully on sentiment lexicons, such as English SentiWordnet (ESWN). While there have been efforts towards building Arabic sentiment lexicons, they suffer from many deficiencies: limited size, unclear usability plan given Arabic’s rich morphology, or nonavailability publicly. In this paper, we address all of these issues and produce the first publicly available large scale Standard Arabic sentiment lexicon (ArSenL) using a combination of existing resources: ESWN, Arabic WordNet, and the Standard Arabic Morphological Analyzer (SAMA). We compare and combine two methods of constructing this lexicon with an eye on insights for Arabic dialects and other low resource languages. We also present an extrinsic evaluation in terms of subjectivity and sentiment analysis.

[1]  Aly A. Fahmy,et al.  A Machine Learning Approach For Opinion Holder Extraction In Arabic Language , 2012, ArXiv.

[2]  Andrea Esuli,et al.  Determining the semantic orientation of terms through gloss classification , 2005, CIKM '05.

[3]  Kerstin Denecke,et al.  Using SentiWordNet for multilingual sentiment analysis , 2008, 2008 IEEE 24th International Conference on Data Engineering Workshop.

[4]  William J. Black,et al.  Arabic SentiWordNet in Relation to SentiWordNet 3.0 , 2013 .

[5]  Andrea Esuli,et al.  Determining the semantic orientation of terms through gloss analysis , 2005, CIKM 2005.

[6]  Andrea Esuli,et al.  SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining , 2010, LREC.

[7]  Nizar Habash,et al.  Introduction to Arabic Natural Language Processing , 2010, Introduction to Arabic Natural Language Processing.

[8]  Christiane Fellbaum,et al.  Introducing the Arabic WordNet project , 2006 .

[9]  Bo Pang,et al.  A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts , 2004, ACL.

[10]  Khaled Shaalan Nizar Y. Habash, Introduction to Arabic natural language processing (Synthesis lectures on human language technologies) , 2011, Machine Translation.

[11]  Maite Taboada,et al.  Lexicon-Based Methods for Sentiment Analysis , 2011, CL.

[12]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[13]  Nizar Habash,et al.  Tharwa: A Large Scale Dialectal Arabic - Standard Arabic - English Lexicon , 2014, LREC.

[14]  Bing Liu,et al.  Sentiment Analysis and Subjectivity , 2010, Handbook of Natural Language Processing.

[15]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[16]  Steven Skiena,et al.  Building Sentiment Lexicons for All Major Languages , 2014, ACL.

[17]  M. Maamouri,et al.  The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus , 2004 .

[18]  Muhammad Abdul-Mageed,et al.  Subjectivity and Sentiment Analysis of Modern Standard Arabic , 2011, ACL.

[19]  Bo Pang,et al.  Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales , 2005, ACL.

[20]  Muhammad Abdul-Mageed,et al.  SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis , 2014, LREC.

[21]  Nizar Habash,et al.  Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop , 2005, ACL.

[22]  Marco Guerini,et al.  Depeche Mood: a Lexicon for Emotion Analysis from Crowd Annotated News , 2014, ACL.

[23]  Bruno Ohana,et al.  Sentiment Classification of Reviews Using SentiWordNet , 2009 .