Paper Information - Twitter Sentiment Analysis

Twitter Sentiment Analysis

This paper covers two approaches to sentiment analysis: (i) the lexicon-based method and (ii) the machine learning method. We describe several techniques for implementing these approaches and discuss how they can be adapted for sentiment classification of Twitter messages. We present a comparative study of different lexicon combinations and show that enhancing sentiment lexicons with emoticons, abbreviations and social-media slang expressions increases the accuracy of lexicon-based classification for Twitter. We discuss the importance of the feature generation and feature selection processes for machine learning sentiment classification. To quantify the performance of the main sentiment analysis methods on Twitter, we run these algorithms on a benchmark Twitter dataset from the SemEval-2013 competition, task 2-B. The results show that the machine learning method based on SVM and Naive Bayes classifiers outperforms the lexicon method. We present a new ensemble method that uses a lexicon-based sentiment score as an input feature for the machine learning approach. The combined method produced more precise classifications. We also show that employing a cost-sensitive classifier for highly unbalanced datasets improves sentiment classification performance by up to 7%.
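The two ideas at the centre of the abstract, scoring a tweet against a lexicon extended with emoticons and slang, and passing that score to a learner as one more input feature, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon entries and weights, the tokenizer, the zero threshold, and the function names (`lexicon_score`, `classify`, `feature_vector`) are hypothetical and are not taken from the paper or from the SemEval-2013 data.

```python
import re

# Toy lexicons. Every entry and weight is an illustrative assumption,
# not a value from the paper.
WORD_LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0,
                "bad": -1.0, "awful": -2.0, "hate": -2.0}
EMOTICON_LEXICON = {":)": 1.0, ":-)": 1.0, ":D": 2.0, ":(": -1.0, ":-(": -1.0}
SLANG_LEXICON = {"lol": 1.0, "gr8": 2.0, "wtf": -2.0, "smh": -1.0}


def tokenize(tweet):
    """Split on whitespace, keep known emoticons intact, and strip
    punctuation from everything else after lowercasing."""
    tokens = []
    for raw in tweet.split():
        if raw in EMOTICON_LEXICON:
            tokens.append(raw)
        else:
            word = re.sub(r"[^\w']", "", raw.lower())
            if word:
                tokens.append(word)
    return tokens


def lexicon_score(tweet, lexicons):
    """Sum the weights of all tokens found in any of the given lexicons.
    A token is scored by the first lexicon that contains it."""
    total = 0.0
    for token in tokenize(tweet):
        for lexicon in lexicons:
            if token in lexicon:
                total += lexicon[token]
                break
    return total


def classify(tweet, lexicons, threshold=0.0):
    """Map the lexicon score to a label using a symmetric threshold."""
    score = lexicon_score(tweet, lexicons)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def feature_vector(tweet, vocabulary, lexicons):
    """Bag-of-words counts over a fixed vocabulary, with the lexicon
    score appended as the last feature for a downstream classifier."""
    tokens = tokenize(tweet)
    counts = [float(tokens.count(term)) for term in vocabulary]
    return counts + [lexicon_score(tweet, lexicons)]


ALL_LEXICONS = [WORD_LEXICON, EMOTICON_LEXICON, SLANG_LEXICON]

# The word-only lexicon misses both sentiment cues in this tweet;
# the extended lexicon combination picks them up.
print(classify("gr8 game :D", [WORD_LEXICON]))
print(classify("gr8 game :D", ALL_LEXICONS))
print(feature_vector("good good day :(", ["good", "day"], ALL_LEXICONS))
```

In this sketch the vector returned by `feature_vector` could be passed to any classifier such as an SVM or Naive Bayes model. For unbalanced classes, a cost-sensitive variant would additionally weight misclassification errors per class, which is not shown here.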
