Enriching Lexicons with Ephemeral Words for Sentiment Analysis in Social Streams

Lexical approaches to sentiment analysis, such as SentiWordNet, rely on a fixed dictionary of words with fixed sentiment, i.e., sentiment that does not change. With the rise of Web 2.0, however, we increasingly observe that words that are not sentimental per se are often associated with positive or negative feelings, for example "refugees", "Trump", and "iphone". Typically, those feelings are temporary responses to external events, for example the sentiment toward "iphone" upon the release of the latest iPhone version, or toward "Trump" after the USA withdrew from the Paris climate agreement. In this work, we propose an approach for extracting and monitoring what we call ephemeral words from social streams: words that convey sentiment without being sentimental themselves and whose sentiment may change over time. Such words cannot be part of a lexicon like SentiWordNet because their sentiment is ephemeral; however, detecting them and estimating their sentiment can significantly improve the performance of lexicon-based approaches, as our experiments show.
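To make the idea concrete, the following is a minimal sketch of how a fixed lexicon might be enriched with time-decayed scores for words it does not contain. It is an illustration under stated assumptions, not the method evaluated in this work: the toy `BASE_LEXICON`, the `EphemeralLexicon` class, the exponential `decay` factor, and the `min_weight` evidence threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Toy stand-in for a fixed lexicon such as SentiWordNet (hypothetical scores).
BASE_LEXICON = {"great": 1.0, "love": 1.0, "bad": -1.0, "terrible": -1.0}


def base_score(tokens):
    """Mean fixed-lexicon score of the tokens found in BASE_LEXICON, else 0.0."""
    hits = [BASE_LEXICON[w] for w in tokens if w in BASE_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


class EphemeralLexicon:
    """Keeps a decayed sentiment estimate for words outside the base lexicon."""

    def __init__(self, decay=0.5, min_weight=1.0):
        self.decay = decay            # assumed forgetting factor per time step
        self.min_weight = min_weight  # assumed minimum evidence before trusting a word
        self.score_sum = defaultdict(float)
        self.weight = defaultdict(float)

    def tick(self):
        """Advance one time step so that older evidence fades."""
        for w in list(self.weight):
            self.score_sum[w] *= self.decay
            self.weight[w] *= self.decay

    def update(self, tokens):
        """Credit the post's base-lexicon score to its non-lexicon words."""
        s = base_score(tokens)
        if s == 0.0:
            return  # the post carries no lexicon evidence to propagate
        for w in set(tokens):
            if w not in BASE_LEXICON:
                self.score_sum[w] += s
                self.weight[w] += 1.0

    def sentiment(self, word):
        """Current estimate for a word, or 0.0 if the evidence is too weak."""
        weight = self.weight.get(word, 0.0)
        if weight < self.min_weight:
            return 0.0
        return self.score_sum[word] / weight


def enriched_score(tokens, ephemeral):
    """Score a post with the base lexicon plus the ephemeral estimates."""
    scores = [BASE_LEXICON.get(w, ephemeral.sentiment(w)) for w in tokens]
    nonzero = [s for s in scores if s != 0.0]
    return sum(nonzero) / len(nonzero) if nonzero else 0.0


eph = EphemeralLexicon()
eph.update(["love", "the", "new", "iphone"])
eph.update(["iphone", "is", "great"])
positive_phase = enriched_score(["iphone"], eph)   # positive after a release

eph.tick()
eph.tick()                                         # old evidence fades
eph.update(["iphone", "battery", "terrible"])
negative_phase = enriched_score(["iphone"], eph)   # polarity has flipped
```

In this sketch a post such as "iphone" alone receives no score from the fixed lexicon, but it does receive one from the ephemeral estimates, and that score changes sign as newer posts arrive. The sketch also assigns scores to function words such as "the"; a practical system would need some filtering step, which is left out here.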
