AROMA

While research on English opinion mining has already achieved significant progress and success, work on Arabic opinion mining is still lagging. This is mainly due to the relative recency of research efforts in developing natural language processing (NLP) methods for Arabic, handling its morphological complexity, and the lack of large-scale opinion resources for Arabic. To close this gap, we examine the class of models used for English and that do not require extensive use of NLP or opinion resources. In particular, we consider the Recursive Auto Encoder (RAE). However, RAE models are not as successful in Arabic as they are in English, due to their limitations in handling the morphological complexity of Arabic, providing a more complete and comprehensive input features for the auto encoder, and performing semantic composition following the natural way constituents are combined to express the overall meaning. In this article, we propose A Recursive Deep Learning Model for Opinion Mining in Arabic (AROMA) that addresses these limitations. AROMA was evaluated on three Arabic corpora representing different genres and writing styles. Results show that AROMA achieved significant performance improvements compared to the baseline RAE. It also outperformed several well-known approaches in the literature.

[1]  Preslav Nakov,et al.  Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts , 2016, Language Resources and Evaluation.

[2]  Erik Cambria,et al.  Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis , 2015 .

[3]  Vadlamani Ravi,et al.  A survey on opinion mining and sentiment analysis: Tasks, approaches and applications , 2015, Knowl. Based Syst..

[4]  Ting Liu,et al.  Document Modeling with Gated Recurrent Neural Network for Sentiment Classification , 2015, EMNLP.

[5]  Hazem M. Hajj,et al.  Deep Learning Models for Sentiment Analysis in Arabic , 2015, ANLP@ACL.

[6]  Hazem M. Hajj,et al.  A Light Lexicon-based Mobile Application for Sentiment Mining of Arabic Tweets , 2015, ANLP@ACL.

[7]  Nizar Habash,et al.  Annotating Targets of Opinions in Arabic using Crowdsourcing , 2015, ANLP@ACL.

[8]  Sherif Abdou,et al.  Sentiment Analysis For Modern Standard Arabic And Colloquial , 2015, ArXiv.

[9]  Xabier Artola,et al.  Big data for Natural Language Processing: A streaming approach , 2015, Knowl. Based Syst..

[10]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[11]  Nizar Habash,et al.  A Large Scale Arabic Sentiment Lexicon for Arabic Opinion Mining , 2014, ANLP@EMNLP.

[12]  Nizar Habash,et al.  The First QALB Shared Task on Automatic Text Correction for Arabic , 2014, ANLP@EMNLP.

[13]  Verena Rieser,et al.  An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis , 2014, LREC.

[14]  Nizar Habash,et al.  MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic , 2014, LREC.

[15]  Muhammad Abdul-Mageed,et al.  SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis , 2014, LREC.

[16]  Mahmoud Al-Ayyoub,et al.  An analytical study of Arabic sentiments: Maktoob case study , 2013, 8th International Conference for Internet Technology and Secured Transactions (ICITST-2013).

[17]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[18]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[19]  Fahad Alotaiby,et al.  Arabic vs. English: Comparative Statistical Study , 2013, Arabian Journal for Science and Engineering.

[20]  Amir F. Atiya,et al.  LABR: A Large Scale Arabic Book Reviews Dataset , 2013, ACL.

[21]  Houda Benbrahim,et al.  A cross-study of Sentiment Classification on Arabic corpora , 2012, SGAI Conf..

[22]  Yoshua Bengio,et al.  Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.

[23]  A. Shoukry,et al.  Sentence-level Arabic sentiment analysis , 2012, 2012 International Conference on Collaboration Technologies and Systems (CTS).

[24]  Luis Alfonso Ureña López,et al.  OCA: Opinion corpus for Arabic , 2011, J. Assoc. Inf. Sci. Technol..

[25]  Nizar Habash,et al.  Orthographic and morphological processing for English–Arabic statistical machine translation , 2011, Machine Translation.

[26]  Jeffrey Pennington,et al.  Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.

[27]  Johanna D. Moore,et al.  Twitter Sentiment Analysis: The Good the Bad and the OMG! , 2011, ICWSM.

[28]  Lei Zhang,et al.  Combining lexicon-based and learning-based methods for twitter sentiment analysis , 2011 .

[29]  Andrew Y. Ng,et al.  Parsing Natural Scenes and Natural Language with Recursive Neural Networks , 2011, ICML.

[30]  Muhammad Abdul-Mageed,et al.  Subjectivity and Sentiment Analysis of Modern Standard Arabic , 2011, ACL.

[31]  Hsinchun Chen,et al.  Selecting Attributes for Sentiment Classification Using Feature Relation Networks , 2011, IEEE Transactions on Knowledge and Data Engineering.

[32]  Imed Zitouni,et al.  Book Reviews: Introduction to Arabic Natural Language Processing by Nizar Y. Habash , 2010, CL.

[33]  Christopher D. Manning,et al.  Better Arabic Parsing: Baselines, Evaluations, and Analysis , 2010, COLING.

[34]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[35]  Hsinchun Chen,et al.  Sentiment analysis in multiple languages: Feature selection for opinion classification in Web forums , 2008, TOIS.

[36]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[37]  Owen Rambow,et al.  Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop , 2005, ACL.

[38]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[39]  Unesco , 1973, International and Comparative Law Quarterly.

[40]  Noam Chomsky,et al.  On Certain Formal Properties of Grammars , 1959, Inf. Control..

[41]  Nizar Habash,et al.  Improving Arabic Diacritization through Syntactic Analysis , 2015, EMNLP.

[42]  Nazlia Omar,et al.  Ensemble of Classification Algorithms for Subjectivity and Sentiment Analysis of Arabic Customers ' Reviews , 2013 .

[43]  B. Liu,et al.  A Survey of Opinion Mining and Sentiment Analysis , 2012, Mining Text Data.

[44]  Alec Go,et al.  Twitter Sentiment Classification using Distant Supervision , 2009 .

[45]  Christiane Fellbaum,et al.  Introducing the Arabic WordNet project , 2006 .

[46]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[47]  M. Maamouri,et al.  The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus , 2004 .