Social media platforms such as Twitter, Google+, and Facebook have an undeniable effect on the way we store and process information. The information available on the web is abundant, so it is essential to mine what is important and discard irrelevant details. It is also beneficial to consider information that is contextually similar to the information on a particular topic, because it provides a broader picture. Tweets contain keywords known as hashtags, which provide useful information for sentiment analysis, named entity recognition, event detection, and similar tasks. In this paper, we analyze Twitter data based on hashtags, which are widely used nowadays. We extract tweets pertaining to a single keyword and to contextually similar keywords. To find similar words, we use word embeddings, which capture contextual information. We use topic modeling to expose the latent structure of the documents based on probability distributions. The proposed framework helps users find relevant tweets pertaining to a specific hashtag and to contextually similar hashtags.
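As an illustration of the retrieval step described above, the following minimal Python sketch ranks hashtags by cosine similarity of their word embeddings and then keeps tweets that mention the query hashtag or one of its nearest neighbours. It is not the paper's implementation: the three-dimensional vectors in `EMBEDDINGS` are invented toy values, and the function names `similar_hashtags` and `filter_tweets` are hypothetical. A real system would presumably load pretrained embeddings instead.

```python
import math

# Hypothetical toy embeddings (assumption); real vectors would come from a
# pretrained word-embedding model.
EMBEDDINGS = {
    "election": [0.9, 0.1, 0.0],
    "vote": [0.8, 0.2, 0.1],
    "politics": [0.7, 0.3, 0.0],
    "football": [0.0, 0.1, 0.9],
    "goal": [0.1, 0.0, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_hashtags(query, embeddings, top_n=2):
    """Return the top_n words whose embeddings are closest to the query's."""
    q = embeddings[query]
    scored = [(w, cosine(q, v)) for w, v in embeddings.items() if w != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in scored[:top_n]]


def filter_tweets(tweets, query, embeddings, top_n=2):
    """Keep tweets containing the query hashtag or a contextually similar one."""
    wanted = {query, *similar_hashtags(query, embeddings, top_n)}
    kept = []
    for tweet in tweets:
        # Extract hashtags, dropping the '#' and trailing punctuation.
        tags = {
            tok[1:].lower().strip(".,!?:;")
            for tok in tweet.split()
            if tok.startswith("#")
        }
        if tags & wanted:
            kept.append(tweet)
    return kept


if __name__ == "__main__":
    sample = ["Go #vote today", "What a #goal!", "#Election night"]
    print(similar_hashtags("election", EMBEDDINGS))
    print(filter_tweets(sample, "election", EMBEDDINGS))
```

With these toy vectors, a query for `election` also retrieves tweets tagged `#vote` or `#politics`, which is the kind of contextual expansion the framework aims for.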