Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking and microblogging environments, hashtags (#tags) are often used to categorize messages and mark their key points. Because some social networks, such as Twitter, restrict the number of characters in a message, #tags also help users express their messages concisely. In this paper, a new knowledge-intensive, content-based #tag recommendation system is introduced. The proposed system integrates structured knowledge into every core component. First, the relevant features, semantic structures, and information content are extracted from messages. Since a single message can usually hold little information, a content enrichment module is introduced to identify information structures that improve the representation of the message. The extracted features are represented as a semantic network. Then, a hybrid, multi-layered similarity module identifies the commonalities and differences among the features, semantics, and information content of messages. Finally, #tags are recommended to users based on the #tags found in contextually similar messages. The system is evaluated on the Tweets2011 dataset. The results suggest that the proposed method can recommend suitable #tags with negligible operational time, even when little content is available.
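The final recommendation step described above (suggesting #tags drawn from the most similar messages) can be illustrated with a minimal sketch. It is not the system from the paper: plain Jaccard overlap of word tokens stands in for the hybrid, multi-layered semantic similarity module, there is no content enrichment or semantic network, and the names `split_message`, `jaccard`, and `recommend`, as well as the parameters `k` and `top_n`, are hypothetical.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")
TOKEN = re.compile(r"[a-z0-9]+")


def split_message(text):
    """Return (content tokens, hashtags) for one message, both lowercased."""
    tags = {t.lower() for t in HASHTAG.findall(text)}
    body = HASHTAG.sub(" ", text.lower())  # drop hashtags from the content
    return set(TOKEN.findall(body)), tags


def jaccard(a, b):
    """Token-overlap similarity; an assumed stand-in for semantic similarity."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(query, corpus, k=3, top_n=2):
    """Rank hashtags of the k most similar tagged messages by summed similarity."""
    q_tokens, _ = split_message(query)
    scored = []
    for msg in corpus:
        tokens, tags = split_message(msg)
        sim = jaccard(q_tokens, tokens)
        if sim > 0 and tags:
            scored.append((sim, tags))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    votes = defaultdict(float)
    for sim, tags in scored[:k]:
        for tag in tags:
            votes[tag] += sim  # each neighbor votes with its similarity

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]


corpus = [
    "earthquake hits the city center #earthquake #news",
    "new phone released today #tech",
    "strong earthquake shakes the city #earthquake",
]
print(recommend("earthquake in the city", corpus))
```

In this toy corpus the two earthquake messages are the only neighbors with non-zero overlap, so their tags are the ones suggested. Replacing `jaccard` with a knowledge-based measure would bring the sketch closer to the approach the abstract outlines, at the cost of requiring an external lexical resource.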
