Word Segmentation Algorithms with Lexical Resources for Hashtag Classification

We present a novel method for classifying hashtags into two types: those that carry sentiment information and those that do not. The complex structure of hashtags, which often run several words together without delimiters, makes sentiment information difficult to identify. To address this problem, we segment hashtags into smaller semantic units using word segmentation algorithms in conjunction with lexical resources, and then classify the hashtag type. Our experimental results show that this approach achieves a 14% increase in accuracy over baseline methods for identifying hashtags with sentiment information. In addition, using this hashtag type for the subjectivity detection of tweets achieves over 94% recall.
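The general idea of segmenting a hashtag and then consulting a lexicon can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary `VOCAB`, the word list `SENTIMENT_WORDS`, the function names, and the fewest-words dynamic program are all assumptions chosen for illustration, standing in for real lexical resources and whichever segmentation algorithms the paper actually evaluates.

```python
from functools import lru_cache

# Hypothetical toy resources; a real system would load a full word list
# and an established sentiment lexicon instead.
VOCAB = {"i", "love", "this", "song", "new", "york", "city",
         "so", "happy", "not", "good"}
SENTIMENT_WORDS = {"love", "happy", "good"}


def segment(hashtag):
    """Split a hashtag into vocabulary words, preferring the fewest words.

    Returns a list of words, or None if the text cannot be fully covered
    by words from VOCAB.
    """
    text = hashtag.lstrip("#").lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Best segmentation of text[start:], as a tuple of words.
        if start == len(text):
            return ()
        candidates = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in VOCAB:
                rest = best(end)
                if rest is not None:
                    candidates.append((word,) + rest)
        if not candidates:
            return None
        return min(candidates, key=len)

    result = best(0)
    return list(result) if result is not None else None


def has_sentiment(hashtag):
    """Label a hashtag as sentiment-bearing if any segment is in the lexicon."""
    words = segment(hashtag)
    if words is None:
        return False
    return any(word in SENTIMENT_WORDS for word in words)


if __name__ == "__main__":
    for tag in ["#ilovethissong", "#NewYorkCity", "#sohappy"]:
        print(tag, segment(tag), has_sentiment(tag))
```

Under these assumptions, a hashtag that cannot be fully segmented is treated as carrying no sentiment; a practical system would likely need a fallback for out-of-vocabulary tokens and a scoring scheme (for example, word frequencies) to choose between competing segmentations.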
