Paper Information - Twitter Sentiment Analysis

Twitter Sentiment Analysis

This paper covers two approaches to sentiment analysis: (i) the lexicon-based method and (ii) the machine learning method. We describe several techniques for implementing these approaches and discuss how they can be adopted for sentiment classification of Twitter messages. We present a comparative study of different lexicon combinations and show that enhancing sentiment lexicons with emoticons, abbreviations, and social-media slang expressions increases the accuracy of lexicon-based classification for Twitter. We discuss the importance of the feature generation and feature selection processes for machine learning sentiment classification. To quantify the performance of the main sentiment analysis methods on Twitter, we run these algorithms on a benchmark Twitter dataset from the SemEval-2013 competition, Task 2-B. The results show that the machine learning method, based on SVM and Naive Bayes classifiers, outperforms the lexicon method. We present a new ensemble method that uses a lexicon-based sentiment score as an input feature for the machine learning approach. The combined method produced more precise classifications. We also show that employing a cost-sensitive classifier for highly unbalanced datasets improves sentiment classification performance by up to 7%.
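As a rough illustration of two ideas in the abstract, the sketch below scores a tweet against a lexicon extended with emoticons and slang, then exposes that score as one extra feature alongside unigram counts, as the ensemble idea suggests. The lexicon entries, weights, the `__lexicon_score__` feature name, and all function names are hypothetical choices for this example and are not taken from the paper. A real system would load published lexicons and pass the features to a classifier such as an SVM.

```python
from collections import Counter

# Hypothetical toy word lexicon; weights are made up for illustration.
WORD_LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0,
                "bad": -1.0, "awful": -2.0, "hate": -2.0}

# Assumed Twitter-specific additions: emoticons, abbreviations, slang.
TWITTER_LEXICON = {":)": 1.0, ":(": -1.0, ":D": 2.0,
                   "lol": 1.0, "gr8": 2.0, "wtf": -1.0}

COMBINED_LEXICON = {**WORD_LEXICON, **TWITTER_LEXICON}


def tokenize(tweet):
    """Split on whitespace, keeping known emoticons verbatim."""
    tokens = []
    for raw in tweet.split():
        if raw in TWITTER_LEXICON:
            tokens.append(raw)  # e.g. ":D" must not be lowercased or stripped
        else:
            word = raw.lower().strip(".,!?;:\"'()")
            if word:
                tokens.append(word)
    return tokens


def lexicon_score(tweet, lexicon):
    """Sum the lexicon weights of all tokens; unknown tokens count as 0."""
    return sum(lexicon.get(tok, 0.0) for tok in tokenize(tweet))


def classify(score):
    """Map a lexicon score to a polarity label using a zero threshold."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def features(tweet, lexicon):
    """Unigram counts plus the lexicon score as one additional feature."""
    feats = dict(Counter(tokenize(tweet)))
    feats["__lexicon_score__"] = lexicon_score(tweet, lexicon)
    return feats


tweet = "gr8 game :D"
base_label = classify(lexicon_score(tweet, WORD_LEXICON))
extended_label = classify(lexicon_score(tweet, COMBINED_LEXICON))
```

In this toy case the plain word lexicon finds no sentiment-bearing token, while the extended lexicon picks up both the abbreviation and the emoticon. The dictionary returned by `features` could be vectorized and fed to any standard classifier.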
