Enhanced Sentiment Learning Using Twitter Hashtags and Smileys

Automated identification of diverse sentiment types can benefit many NLP systems, such as review summarization and public media analysis. In some of these systems, a sentiment value can be assigned to a single sentence or a very short text. In this paper we propose a supervised sentiment classification framework based on data from Twitter, a popular microblogging service. By using 50 Twitter hashtags and 15 smileys as sentiment labels, the framework avoids the need for labor-intensive manual annotation and allows identification and classification of diverse sentiment types in short texts. We evaluate the contribution of different feature types to sentiment classification and show that our framework successfully identifies the sentiment types of untagged sentences. The quality of the sentiment identification was also confirmed by human judges. We also explore dependencies and overlap between the different sentiment types represented by smileys and Twitter hashtags.
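
The labeling idea in the abstract, treating hashtags and smileys as sentiment labels in place of manual annotation, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the label sets below are hypothetical placeholders (the abstract does not list the 50 hashtags or 15 smileys), the function name `label_tweet` is invented for illustration, and removing the label tokens from the text is an assumption made here so that a classifier could not simply read the label back from its input.

```python
import re

# Hypothetical label sets for illustration only; the abstract does not
# enumerate the actual 50 hashtags and 15 smileys.
HASHTAG_LABELS = {"#happy", "#sad", "#angry"}
SMILEY_LABELS = {":)", ":(", ";)"}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(tweet):
    """Return (text_without_labels, sorted_labels), or None if no label occurs."""
    labels = set()

    def strip_tag(match):
        # Remove a hashtag only if it is one of the label hashtags.
        tag = match.group(0).lower()
        if tag in HASHTAG_LABELS:
            labels.add(tag)
            return " "
        return match.group(0)

    text = HASHTAG_RE.sub(strip_tag, tweet)

    # Smileys are matched as plain substrings (a simplifying assumption).
    for smiley in SMILEY_LABELS:
        if smiley in text:
            labels.add(smiley)
            text = text.replace(smiley, " ")

    if not labels:
        return None
    # Collapse the whitespace left behind by removed tokens.
    return " ".join(text.split()), sorted(labels)


if __name__ == "__main__":
    print(label_tweet("Finally got my results #happy :)"))
    print(label_tweet("Just a plain sentence"))
```

Tweets for which the function returns `None` carry no label and would correspond to the untagged sentences that a trained classifier is later asked to tag.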
