Combining a rule-based classifier with weakly supervised learning for Twitter sentiment analysis

Microblogs, especially Twitter, have become an integral part of daily life: millions of users share their thoughts every day, drawn by the short post length and simple manner of expression. Monitoring and analyzing sentiment in such a massive volume of Twitter posts gives companies and other organizations enormous opportunities to learn what users think and feel about their products and services. However, the ever-growing volume of unstructured, informal user-generated posts on Twitter demands sentiment analysis tools that perform well with minimal supervision. In this paper, we propose an approach to sentiment analysis on Twitter that combines a rule-based classifier with a weakly supervised Naive Bayes classifier. To classify tweet sentiment, we introduce a set of rules for the rule-based classifier based on the occurrences of emoticons and sentiment-bearing words, whereas several sentiment lexicons are applied to train the Naive Bayes classifier. We conducted our experiments on the Stanford Sentiment140 dataset. Experimental results demonstrate the effectiveness of our method over the baseline in terms of recall, precision, F1 score, and accuracy.
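
The combination described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the emoticon sets, the word lists, the rule "label a tweet only when the emoticon and word evidence is one-sided", and the choice to use Naive Bayes as a fallback when the rules abstain are all assumptions made for illustration. The names `rule_based`, `LexiconNaiveBayes`, and `classify` are hypothetical.

```python
import math
from collections import Counter

# Toy resources: illustrative assumptions, not the lexicons used in the paper.
POS_EMOTICONS = {":)", ":-)"}
NEG_EMOTICONS = {":(", ":-("}
RULE_POS_WORDS = {"love", "great", "awesome"}
RULE_NEG_WORDS = {"hate", "terrible", "awful"}

# Broader lexicon used as the only (weak) supervision for Naive Bayes.
NB_LEXICON = {
    "positive": {"good", "great", "love", "happy", "nice", "fun", "enjoy", "fast"},
    "negative": {"bad", "terrible", "hate", "sad", "slow", "boring", "broken", "worst"},
}


def tokenize(text):
    # Whitespace split, trim trailing punctuation, lowercase.
    # The emoticons above are unaffected by both operations.
    return [t.strip(".,!?").lower() for t in text.split()]


def rule_based(tokens):
    """Return a label when the evidence is one-sided, otherwise None."""
    pos = sum(t in POS_EMOTICONS or t in RULE_POS_WORDS for t in tokens)
    neg = sum(t in NEG_EMOTICONS or t in RULE_NEG_WORDS for t in tokens)
    if pos > 0 and neg == 0:
        return "positive"
    if neg > 0 and pos == 0:
        return "negative"
    return None  # no signal or mixed signal: the rules abstain


class LexiconNaiveBayes:
    """Multinomial Naive Bayes trained on lexicon entries instead of labeled tweets."""

    def __init__(self, lexicons):
        # Each lexicon word counts as one observation of its class.
        self.counts = {label: Counter(words) for label, words in lexicons.items()}
        self.totals = {label: sum(c.values()) for label, c in self.counts.items()}
        self.vocab = set().union(*lexicons.values())

    def predict(self, tokens):
        known = [t for t in tokens if t in self.vocab]
        scores = {}
        for label, counts in self.counts.items():
            denom = self.totals[label] + len(self.vocab)
            # Uniform class prior; add-one (Laplace) smoothing on word counts.
            scores[label] = sum(math.log((counts[t] + 1) / denom) for t in known)
        # Ties (including tweets with no known words) go to the first label.
        return max(scores, key=scores.get)


_NB = LexiconNaiveBayes(NB_LEXICON)


def classify(text):
    tokens = tokenize(text)
    label = rule_based(tokens)
    return label if label is not None else _NB.predict(tokens)
```

For example, `classify("I love this phone :)")` is decided by the rules, while `classify("the battery is slow and boring")` contains no emoticon or rule word and is therefore passed to the lexicon-trained Naive Bayes model. A practical system would likely need a neutral class or a confidence threshold for tweets with no lexicon hits, which this sketch omits.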
