Methodology for Twitter Sentiment Analysis

This paper presents a step-by-step methodology for Twitter sentiment analysis. Two approaches are tested to measure variations in the public opinion about retail brands. The first, a lexicon-based method, uses a dictionary of words with assigned to them semantic scores to calculate a final polarity of a tweet, and incorporates part of speech tagging. The second, machine learning approach, tackles the problem as a text classification task employing two supervised classifiers Naive Bayes and Support Vector Machines. We show that combining the lexicon and machine learning approaches by using a lexicon score as a one of the features in Naive Bayes and SVM classifications improves the accuracy of classification by 5%.

[1]  Huan Liu,et al.  Exploiting social relations for sentiment analysis in microblogging , 2013, WSDM.

[2]  Karo Moilanen,et al.  Sentiment Composition , 2007 .

[3]  Roberto Basili,et al.  Language sensitive text classification , 2000, RIAO.

[4]  Adam Kowalczyk,et al.  Second Order Features for Maximising Text Classification Performance , 2001, ECML.

[5]  Satoshi Morinaga,et al.  Mining product reputations on the Web , 2002, KDD.

[6]  Jörg Kindermann,et al.  Authorship Attribution with Support Vector Machines , 2003, Applied Intelligence.

[7]  Masaru Kitsuregawa,et al.  Building Lexicon for Sentiment Analysis from Massive Collection of HTML Documents , 2007, EMNLP.

[8]  Kathleen R. McKeown,et al.  Predicting the semantic orientation of adjectives , 1997 .

[9]  Peter D. Turney Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews , 2002, ACL.

[10]  Preslav Nakov,et al.  SemEval-2013 Task 2: Sentiment Analysis in Twitter , 2013, *SEMEVAL.

[11]  Dell Zhang,et al.  Question classification using support vector machines , 2003, SIGIR.

[12]  Janyce Wiebe,et al.  Learning Subjective Adjectives from Corpora , 2000, AAAI/IAAI.

[13]  Janyce Wiebe,et al.  Learning to Disambiguate Potentially Subjective Expressions , 2002, CoNLL.

[14]  David E. Johnson,et al.  Maximizing Text-Mining Performance , 1999 .

[15]  Kristina Lerman,et al.  Tripartite graph clustering for dynamic sentiment analysis on social media , 2014, SIGMOD Conference.

[16]  Claire Cardie,et al.  Annotating Expressions of Opinions and Emotions in Language , 2005, Lang. Resour. Evaluation.

[17]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[18]  Johanna D. Moore,et al.  Twitter Sentiment Analysis: The Good the Bad and the OMG! , 2011, ICWSM.

[19]  J. J. Rocchio,et al.  Relevance feedback in information retrieval , 1971 .

[20]  Ellen Riloff,et al.  Creating Subjective and Objective Sentence Classifiers from Unannotated Texts , 2005, CICLing.

[21]  Gerard Salton,et al.  The SMART Retrieval System—Experiments in Automatic Document Processing , 1971 .

[22]  Brendan T. O'Connor,et al.  Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments , 2010, ACL.

[23]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[24]  Yoram Singer,et al.  BoosTexter: A Boosting-based System for Text Categorization , 2000, Machine Learning.

[25]  Brendan T. O'Connor,et al.  Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics , 2011 .

[26]  Bruno Ohana,et al.  Sentiment Classification of Reviews Using SentiWordNet , 2009 .

[27]  Shi Bing,et al.  Inductive learning algorithms and representations for text categorization , 2006 .

[28]  Junlan Feng,et al.  Robust Sentiment Detection on Twitter from Biased and Noisy Data , 2010, COLING.

[29]  Uzay Kaymak,et al.  Exploiting emoticons in sentiment analysis , 2013, SAC '13.

[30]  Soo-Min Kim,et al.  Determining the Sentiment of Opinions , 2004, COLING.

[31]  Alexandre Plastino,et al.  A Statistical and Evolutionary Approach to Sentiment Analysis , 2014, 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT).

[32]  Bernardo A. Huberman,et al.  Predicting the Future with Social Media , 2010, Web Intelligence.

[33]  Saif Mohammad,et al.  NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets , 2013, *SEMEVAL.

[34]  L. Ladha,et al.  FEATURE SELECTION METHODS AND ALGORITHMS , 2011 .

[35]  Jiebo Luo,et al.  Towards social imagematics: sentiment analysis in social multimedia , 2013, MDMKDD '13.

[36]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[37]  Harith Alani,et al.  Semantic Sentiment Analysis of Twitter , 2012, SEMWEB.

[38]  Philip C. Treleaven,et al.  Twitter Sentiment Analysis Applied to Finance: A Case Study in the Retail Industry , 2015, ArXiv.

[39]  Janyce Wiebe,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.

[40]  Victor S. Sheng,et al.  Cost-Sensitive Learning and the Class Imbalance Problem , 2008 .

[41]  Paul S. Jacobs,et al.  Joining Statistics with NLP for Text Categorization , 1992, ANLP.

[42]  Vaibhavi N Patodkar,et al.  Twitter as a Corpus for Sentiment Analysis and Opinion Mining , 2016 .

[43]  Stan Matwin,et al.  A learner-independent evaluation of the usefulness of statistical phrases for automated text categorization , 2001 .

[44]  George Papadakis,et al.  Content vs. context for sentiment analysis: a comparative analysis over microblogs , 2012, HT '12.

[45]  Yuan-Fang Wang,et al.  The use of bigrams to enhance text categorization , 2002, Inf. Process. Manag..

[46]  Finn Årup Nielsen,et al.  A New ANEW: Evaluation of a Word List for Sentiment Analysis in Microblogs , 2011, #MSM.