Paper Information - Efficient Density Based Clustering of Tweets and Sentimental Analysis Based on Segmentation

Twitter has become a popular social networking site where users share up-to-date information. The short, error-prone nature of tweets makes word-based representations less reliable. Tweet segmentation is the process of splitting a tweet into meaningful segments so that its semantic meaning is preserved and the segments are easy for downstream applications to use. Segmentation is based on a stickiness score that considers both global and local context. Tweets are clustered using the DBSCAN method with the Jaccard coefficient as the similarity measure. Sentiment variations in tweets are measured on the basis of this segmentation. The experimental evaluation shows that segmentation using global terms drawn from wikilinks is more efficient than the normal segmentation. Clustering is more effective with the DBSCAN algorithm, which is well suited to uncertain data.
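To make the clustering step concrete, the following is a minimal sketch of DBSCAN over tweets represented as sets of segments, with Jaccard distance (one minus the Jaccard coefficient) as the dissimilarity. It is an illustration under stated assumptions, not the paper's implementation: the segment sets, the `eps` and `min_pts` values, and the function names `jaccard_distance`, `region_query`, and `dbscan` are all hypothetical, and the segmentation itself is assumed to have been done beforehand.

```python
from typing import FrozenSet, List, Optional

NOISE = -1  # label for tweets that belong to no cluster


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def region_query(points: List[FrozenSet[str]], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of point i (point i included)."""
    return [j for j, p in enumerate(points)
            if jaccard_distance(points[i], p) <= eps]


def dbscan(points: List[FrozenSet[str]], eps: float, min_pts: int) -> List[int]:
    """Plain DBSCAN; min_pts counts the point itself, as in the usual definition."""
    labels: List[Optional[int]] = [None] * len(points)  # None = unvisited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return [label if label is not None else NOISE for label in labels]


# Hypothetical segmented tweets: each tweet is a set of segments.
tweets = [
    frozenset({"world cup", "final", "brazil"}),
    frozenset({"world cup", "final", "germany"}),
    frozenset({"world cup", "brazil", "germany"}),
    frozenset({"new iphone", "apple", "launch"}),
    frozenset({"new iphone", "apple", "price"}),
    frozenset({"traffic jam"}),
]

labels = dbscan(tweets, eps=0.6, min_pts=2)
print(labels)
```

Treating each multi-word segment as a single set element is what distinguishes this from word-based clustering: "world cup" matches only "world cup", not every tweet containing "world". The quadratic neighbourhood search is acceptable for a small illustration; a larger collection would need an index or a precomputed distance matrix.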

Rose V Pattani
