Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew

This paper empirically studies the effects of representation choices on neural sentiment analysis for Modern Hebrew, a morphologically rich language (MRL) for which no sentiment analyzer currently exists. We study two dimensions of representational choices: (i) the granularity of the input signal (token-based vs. morpheme-based), and (ii) the level of encoding of vocabulary items (string-based vs. character-based). We hypothesise that for MRLs, languages where multiple meaning-bearing elements may be carried by a single space-delimited token, these choices will have measurable effects on task perfromance, and that these effects may vary for different architectural designs — fully-connected, convolutional or recurrent. Specifically, we hypothesize that morpheme-based representations will have advantages in terms of their generalization capacity and task accuracy, due to their better OOV coverage. To empirically study these effects, we develop a new sentiment analysis benchmark for Hebrew, based on 12K social media comments, and provide two instances of these data: in token-based and morpheme-based settings. Our experiments show that representation choices empirical effects vary with architecture type. While fully-connected and convolutional networks slightly prefer token-based settings, RNNs benefit from a morpheme-based representation, in accord with the hypothesis that explicit morphological information may help generalize. Our endeavour also delivers the first state-of-the-art broad-coverage sentiment analyzer for Hebrew, with over 89% accuracy, alongside an established benchmark to further study the effects of linguistic representation choices on neural networks’ task performance.

[1]  Wenpeng Yin,et al.  Comparative Study of CNN and RNN for Natural Language Processing , 2017, ArXiv.

[2]  Lei Zhang,et al.  Sentiment Analysis and Opinion Mining , 2017, Encyclopedia of Machine Learning and Data Mining.

[3]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[4]  Ming Zhou,et al.  Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification , 2014, ACL.

[5]  Noah A. Smith,et al.  Transition-Based Dependency Parsing with Stack Long Short-Term Memory , 2015, ACL.

[6]  Alex Graves,et al.  Supervised Sequence Labelling with Recurrent Neural Networks , 2012, Studies in Computational Intelligence.

[7]  Yannick Versley,et al.  Statistical Parsing of Morphologically Rich Languages (SPMRL) What, How and Whither , 2010, SPMRL@NAACL-HLT.

[8]  Cícero Nogueira dos Santos,et al.  Learning Character-level Representations for Part-of-Speech Tagging , 2014, ICML.

[9]  Reut Tsarfaty,et al.  The Interplay of Syntax and Morphology in Building Parsing Models for Modern Hebrew , 2006 .

[10]  Isabell M. Welpe,et al.  Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment , 2010, ICWSM.

[11]  Jun Zhao,et al.  Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks , 2015, ACL.

[12]  Dan Klein,et al.  Neural CRF Parsing , 2015, ACL.

[13]  Mahmoud Al-Ayyoub,et al.  Arabic sentiment analysis: Lexicon-based and corpus-based , 2013, 2013 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT).

[14]  Bernhard Rieder,et al.  Studying Facebook via data extraction: the Netvizz application , 2013, WebSci.

[15]  Mike Thelwall,et al.  Twitter, MySpace, Digg: Unsupervised Sentiment Analysis in Social Media , 2012, TIST.

[16]  Johanna D. Moore,et al.  Twitter Sentiment Analysis: The Good the Bad and the OMG! , 2011, ICWSM.

[17]  Muhammad Abdul-Mageed,et al.  SAMAR: Subjectivity and sentiment analysis for Arabic social media , 2014, Comput. Speech Lang..

[18]  Noah A. Smith,et al.  Improved Transition-based Parsing by Modeling Characters instead of Words with LSTMs , 2015, EMNLP.

[19]  Cícero Nogueira dos Santos,et al.  Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts , 2014, COLING.

[20]  Danqi Chen,et al.  A Fast and Accurate Dependency Parser using Neural Networks , 2014, EMNLP.

[21]  Bowen Zhou,et al.  Classifying Relations by Ranking with Convolutional Neural Networks , 2015, ACL.

[22]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[23]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[24]  Hazem M. Hajj,et al.  Deep Learning Models for Sentiment Analysis in Arabic , 2015, ANLP@ACL.

[25]  Jun Zhao,et al.  Recurrent Convolutional Neural Networks for Text Classification , 2015, AAAI.

[26]  E. Colleoni,et al.  Measuring Organizational Legitimacy in Social Media: Assessing Citizens’ Judgments With Sentiment Analysis , 2018 .

[27]  Reut Tsarfaty,et al.  Data-Driven Morphological Analysis and Disambiguation for Morphologically Rich Languages and Universal Dependencies , 2016, COLING.

[28]  Khalil Sima'an,et al.  Building a tree-bank of modern hebrew text , 2001 .

[29]  Yoav Goldberg,et al.  A Primer on Neural Network Models for Natural Language Processing , 2015, J. Artif. Intell. Res..

[30]  Navneet Kaur,et al.  Opinion mining and sentiment analysis , 2016, 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom).

[31]  Giuseppe Porro,et al.  Every tweet counts? How sentiment analysis of social media can improve our knowledge of citizens’ political preferences with an application to Italy and France , 2013, New Media Soc..

[32]  Wang Ling,et al.  Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation , 2015, EMNLP.

[33]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.