Meta-level sentiment models for big social data analysis

People react to events, topics and entities by expressing their personal opinions and emotions. These reactions span a wide range of intensities, from very mild to strong. The processing and understanding of these expressions has been the subject of research in several fields, such as business and politics. In this context, Twitter sentiment analysis, the task of automatically identifying and extracting subjective information from tweets, has received increasing attention from the Web mining community. Twitter provides extremely valuable insight into human opinions, and it also poses challenging new Big Data problems. These problems include the processing of massive volumes of streaming data and the automatic identification of human expressiveness within short text messages. In this area, several methods and lexical resources have been proposed to extract sentiment indicators from natural language texts at both the syntactic and semantic levels. These approaches address different dimensions of opinion, such as subjectivity, polarity, intensity and emotion. This article is the first study of how these resources, which focus on different sentiment scopes, complement each other. To this end, we identify scenarios in which some of these resources are more useful than others. Furthermore, we propose a novel approach to sentiment classification based on meta-level features. This supervised approach improves on existing methods for subjectivity and polarity classification on Twitter. Our results show that the combination of meta-level features provides significant improvements in performance. However, we observe important differences that depend on the type of lexical resource, the dataset used to build the model, and the learning strategy. Experimental results indicate that manually generated lexicons focus on emotional words, which makes them very useful for polarity prediction. In contrast, lexicons generated with automatic methods include neutral words, which introduce noise into the detection of subjectivity. Our findings indicate that polarity and subjectivity prediction are different dimensions of the same problem, but that they need to be addressed using different feature subspaces. Lexicon-based approaches are recommended for polarity, and stylistic, part-of-speech-based approaches are meaningful for subjectivity. With this research we offer a more global insight into the resource components required for the complex task of classifying human emotion and opinion.
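The idea of meta-level features can be illustrated with a minimal Python sketch: instead of using the words of a tweet directly, each tweet is described by aggregate signals drawn from several sentiment resources (polarity, strength and emotion), and the resulting vector could then be passed to a supervised classifier. The three small lexicons, the feature names and the `meta_features` function below are hypothetical and invented for illustration; they are not the resources or the exact feature set evaluated in the article.

```python
# Toy resources, invented for illustration only.
# They are NOT the lexicons evaluated in the article.
POLARITY_LEXICON = {"good": "positive", "love": "positive",
                    "bad": "negative", "hate": "negative"}
STRENGTH_LEXICON = {"good": 2, "love": 4, "bad": -2, "hate": -4}
EMOTION_LEXICON = {"love": "joy", "hate": "anger"}


def meta_features(tweet):
    """Map a tweet to a small dictionary of meta-level features.

    Each feature aggregates the output of one resource over the whole
    tweet, so resources covering different sentiment scopes end up
    side by side in the same vector.
    """
    features = {
        "pos_words": 0, "neg_words": 0,        # polarity scope
        "pos_strength": 0, "neg_strength": 0,  # intensity scope
        "joy_words": 0, "anger_words": 0,      # emotion scope
    }
    for token in tweet.lower().split():
        word = token.strip(".,!?")  # naive punctuation handling

        polarity = POLARITY_LEXICON.get(word)
        if polarity == "positive":
            features["pos_words"] += 1
        elif polarity == "negative":
            features["neg_words"] += 1

        strength = STRENGTH_LEXICON.get(word, 0)
        if strength > 0:
            features["pos_strength"] += strength
        elif strength < 0:
            features["neg_strength"] += strength

        emotion = EMOTION_LEXICON.get(word)
        if emotion is not None:
            features[emotion + "_words"] += 1
    return features


print(meta_features("I love this, good job!"))
print(meta_features("I hate bad days"))
```

In a fuller pipeline, vectors of this kind would be computed for every labeled tweet and used to train separate classifiers for subjectivity and for polarity, which would make it possible to compare how much each resource contributes to each task.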
