BabelSenticNet: A Commonsense Reasoning Framework for Multilingual Sentiment Analysis

SenticNet is a concept-level knowledge base used to develop commonsense reasoning algorithms for sentiment analysis. A major limitation of this resource is that it is available only for English. Prototype algorithms have recently been proposed to create concept-level knowledge bases for other languages, but they rely on a number of heterogeneous resources, which complicates comparison, reproducibility and maintenance. This paper proposes a simple and replicable method to automatically generate SenticNet for a variety of languages; the result is BabelSenticNet. We first use statistical machine translation tools to create a high-coverage version of SenticNet for the target language. We then introduce an algorithm that increases the robustness of the translated resources by means of a mapping technique based on WordNet and its multilingual versions. SenticNet versions for 40 languages have been made available. Human evaluation on languages belonging to different families, alphabets and cultures supports the robustness of the method and its potential usefulness for future research on multilingual concept-level sentiment analysis.
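
The two-step pipeline summarized above can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details are not given in this abstract. It assumes that machine translation yields one target-language candidate per English concept, that a multilingual WordNet mapping yields a set of candidates, and that agreement between the two can be read as a sign of a more reliable entry. The function `build_lexicon`, the `Entry` record, the `source` labels, the polarity scores and the Spanish word lists are all hypothetical toy values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Entry:
    concept: str     # target-language concept
    polarity: float  # polarity inherited from the English concept
    source: str      # "mt", "wordnet", or "mt+wordnet"


def build_lexicon(
    english: Dict[str, float],
    mt: Dict[str, str],
    wordnet_map: Dict[str, Set[str]],
) -> Dict[str, Entry]:
    """Project English polarities onto target-language concepts.

    Assumption (not from the paper): an MT candidate that also appears in
    the WordNet-based mapping is labelled "mt+wordnet"; other candidates
    keep the label of the single resource that produced them.
    """
    lexicon: Dict[str, Entry] = {}
    for concept, polarity in english.items():
        wn_words = wordnet_map.get(concept, set())
        mt_word = mt.get(concept)
        if mt_word is not None:
            source = "mt+wordnet" if mt_word in wn_words else "mt"
            lexicon[mt_word] = Entry(mt_word, polarity, source)
        # WordNet-only candidates never overwrite an existing entry.
        for word in sorted(wn_words):
            lexicon.setdefault(word, Entry(word, polarity, "wordnet"))
    return lexicon


# Toy English -> Spanish data; scores are illustrative, not SenticNet values.
english = {"happy": 0.8, "sad": -0.7, "celebrate": 0.6}
mt = {"happy": "feliz", "sad": "triste", "celebrate": "celebrar"}
wordnet_map = {
    "happy": {"feliz", "contento"},
    "celebrate": {"celebrar", "festejar"},
}

lexicon = build_lexicon(english, mt, wordnet_map)
for entry in lexicon.values():
    print(entry.concept, entry.polarity, entry.source)
```

In this toy setting, machine translation supplies coverage (every English concept receives a candidate), while the WordNet-based mapping both confirms some candidates and contributes additional synonyms. Collisions between concepts that share a target-language word are resolved naively here and would need an explicit policy in practice.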
