Sentiment Spreading: An Epidemic Model for Lexicon-Based Sentiment Analysis on Twitter

While sentiment analysis has received significant attention in recent years, problems remain when tools are applied to microblogging content, because the text to be analysed typically consists of very short messages lacking structure and semantic context. At the same time, the amount of text produced by online platforms is enormous, so simple, fast and effective methods are needed to study sentiment in these data efficiently. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate the sentiment of longer sentences, can be a valid approach. Here we present a method based on epidemic spreading that automatically extends the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment and to produce tweet sentiment classifications comparable to those of the original dictionary, with the advantage of tagging more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data.
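As a rough illustration of the idea, the sketch below spreads valences from a small seed dictionary to unknown words through the tweets they share. It is not the procedure of the paper, whose details the abstract does not give. The update rule (a tweet takes the mean valence of its known words; an unknown word takes the mean valence of the tweets that contain it), the function names `tweet_valence` and `spread_lexicon`, the parameters `rounds` and `min_tweets`, and the toy valences are all assumptions made for this example.

```python
from collections import defaultdict


def tweet_valence(tokens, lexicon):
    """Mean valence of the tokens found in the lexicon, or None if none are."""
    known = [lexicon[t] for t in tokens if t in lexicon]
    return sum(known) / len(known) if known else None


def spread_lexicon(tweets, seed, rounds=2, min_tweets=2):
    """Extend a seed lexicon by propagating valences through tweets.

    Assumed rule (hypothetical, for illustration only): in each round,
    every tweet that contains at least one known word is scored, and each
    unknown word seen in at least `min_tweets` scored tweets receives the
    mean of those tweet scores.
    """
    lexicon = dict(seed)  # copy so the seed dictionary is left untouched
    for _ in range(rounds):
        exposures = defaultdict(list)  # unknown word -> scores of its tweets
        for tokens in tweets:
            score = tweet_valence(tokens, lexicon)
            if score is None:
                continue  # tweet has no known word, so it cannot spread anything
            for token in set(tokens):
                if token not in lexicon:
                    exposures[token].append(score)
        new_words = {
            word: sum(scores) / len(scores)
            for word, scores in exposures.items()
            if len(scores) >= min_tweets
        }
        if not new_words:
            break  # nothing new was learned, so further rounds change nothing
        lexicon.update(new_words)
    return lexicon


# Toy data: tokenised tweets and a seed lexicon with made-up valences.
tweets = [
    ["love", "sunny", "beach"],
    ["happy", "sunny", "day"],
    ["hate", "rainy", "traffic"],
    ["sad", "rainy", "monday"],
    ["beach", "vibes"],
]
seed = {"love": 8.0, "happy": 8.0, "hate": 2.0, "sad": 2.0}

extended = spread_lexicon(tweets, seed)
print(sorted(set(extended) - set(seed)))
```

In this toy run, a tweet such as `["sunny", "rainy"]` cannot be scored with the seed dictionary but can be scored with the extended one, which mirrors the coverage advantage described above. The `min_tweets` threshold is a design choice of this sketch that keeps words seen only once from inheriting the valence of a single tweet.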
