Sentiment Analysis of Twitter Posts About News

This thesis addresses a practical problem: sentiment analysis of Twitter posts about news. Its contributions are the collection of tweets about news, an empirical study of the role of context in the sentiment analysis of such tweets, and the selection of the best-performing features. Test data was collected with a bootstrapping approach: tweets containing some number of words from a news headline were used to extract links, and those links were then used to obtain more tweets about news from Twitter. The test data was manually inspected and annotated for sentiment to determine whether context plays a role in the sentiment of a tweet. Uni-gram+bi-gram was selected as the feature set because it captures two important properties of the data: uni-grams provide better coverage of the data, and bi-grams capture sentiment expression patterns. The thesis shows that tweets about news can be collected automatically and analyzed successfully for their sentiment. A Multinomial Naive Bayes classifier using uni-gram+bi-gram presence gave the highest accuracy: 87.78% for three-class classification and 90.79% for the two-class (subjective, objective) classifier derived from the three-class classifier. These accuracies are high enough for the classifier to be used in practical applications that deal with sentiment analysis of tweets.
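As a rough illustration of the classifier described above, the following is a minimal sketch of a Multinomial Naive Bayes model over uni-gram+bi-gram presence features. It is not the thesis's implementation. The whitespace tokenizer, the add-one smoothing, the class labels (`positive`, `negative`, `neutral`), the toy training tweets, and the mapping used to derive the two-class labels (positive and negative become `subjective`, neutral becomes `objective`) are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def presence_features(text):
    """Return the set of uni-grams and bi-grams present in a tweet.

    Using a set records presence only: a term that occurs twice in a
    tweet counts once. Tokenization is a plain whitespace split (assumed).
    """
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


class PresenceNaiveBayes:
    """Multinomial Naive Bayes over binary (presence) n-gram features."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = presence_features(text)
            self.feature_counts[label].update(feats)
            self.vocab |= feats
        return self

    def predict(self, text):
        # Features never seen in training are ignored.
        feats = presence_features(text) & self.vocab
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in feats:
                score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def to_two_class(label):
    """Collapse three-class labels to subjective/objective (assumed mapping)."""
    return "objective" if label == "neutral" else "subjective"


# Toy training data, invented for illustration only.
train = [
    ("great news for the economy", "positive"),
    ("what a wonderful rescue story", "positive"),
    ("terrible news about the flood", "negative"),
    ("this is an awful decision", "negative"),
    ("parliament votes on the budget today", "neutral"),
    ("minister to visit the capital on monday", "neutral"),
]
model = PresenceNaiveBayes().fit(*zip(*train))
```

Calling `model.predict(tweet)` returns one of the three class labels, and `to_two_class(model.predict(tweet))` gives the derived subjective/objective decision without training a second model.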
