Paper information - Tweet modeling with LSTM recurrent neural networks for hashtag recommendation

The hash symbol (#), used to form a hashtag, marks a keyword or topic in a tweet. It was created organically by users as a way to categorize messages. Hashtags also provide valuable information for many research applications such as sentiment classification and topic analysis. However, only a small fraction of tweets are manually annotated with hashtags. Therefore, an automatic hashtag recommendation method is needed to help users tag their new tweets. Previous methods mostly use conventional machine learning classifiers such as SVM or rely on collaborative filtering techniques. A limitation of these approaches is that they all represent tweets with the TF-IDF scheme and thus ignore the semantic information in tweets. In this paper, we also regard hashtag recommendation as a classification task but propose a novel recurrent neural network model that learns vector-based tweet representations to recommend hashtags. More precisely, we use a skip-gram model to generate distributed word representations and then apply a convolutional neural network to learn semantic sentence vectors. Afterwards, we use the sentence vectors to train a long short-term memory recurrent neural network (LSTM-RNN). We directly use the resulting tweet vectors as features to classify hashtags without any feature engineering. Hashtag recommendation experiments on real-world Twitter data show that the proposed LSTM-RNN model outperforms state-of-the-art methods, and that the LSTM unit achieves the best performance compared with the standard RNN and the gated recurrent unit (GRU).
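The pipeline described in the abstract (word vectors, then a convolutional layer producing sentence vectors, then an LSTM producing a tweet vector, then a classifier over hashtags) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the dimensions, the `HASHTAGS` label set, the period-based sentence splitting, and every function name are assumptions made for illustration. The weights are random and untrained, and random vectors stand in for skip-gram embeddings.

```python
import math
import random

random.seed(0)

# Hypothetical sizes and label set; the abstract does not specify any of these.
EMB, SENT, HID, WIN = 8, 6, 5, 2
HASHTAGS = ["#nlp", "#music", "#sports"]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Stand-in for skip-gram: each new word gets a fixed random vector.
vocab = {}

def embed(word):
    if word not in vocab:
        vocab[word] = [random.uniform(-0.1, 0.1) for _ in range(EMB)]
    return vocab[word]

# Convolution over windows of WIN consecutive words, then max-pooling over time.
conv_w = rand_matrix(SENT, EMB * WIN)

def sentence_vector(words):
    vecs = [embed(w) for w in words]
    while len(vecs) < WIN:  # zero-pad very short sentences
        vecs.append([0.0] * EMB)
    windows = [sum(vecs[i:i + WIN], []) for i in range(len(vecs) - WIN + 1)]
    feats = [[math.tanh(a) for a in matvec(conv_w, win)] for win in windows]
    return [max(col) for col in zip(*feats)]

# One LSTM step with input, forget, and output gates plus a candidate cell
# state (bias terms omitted for brevity).
gates = {g: rand_matrix(HID, SENT + HID) for g in "ifoc"}

def lstm_step(x, h, c):
    z = x + h  # concatenate input and previous hidden state
    i = [sigmoid(a) for a in matvec(gates["i"], z)]
    f = [sigmoid(a) for a in matvec(gates["f"], z)]
    o = [sigmoid(a) for a in matvec(gates["o"], z)]
    g = [math.tanh(a) for a in matvec(gates["c"], z)]
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

out_w = rand_matrix(len(HASHTAGS), HID)

def recommend(tweet):
    """Return (hashtag, probability) pairs, highest probability first."""
    sentences = [s.split() for s in tweet.lower().split(".") if s.strip()]
    h, c = [0.0] * HID, [0.0] * HID
    for words in sentences:
        h, c = lstm_step(sentence_vector(words), h, c)
    scores = matvec(out_w, h)  # h is the tweet vector
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(HASHTAGS, probs), key=lambda p: p[1], reverse=True)

ranked = recommend("Great concert tonight. The band played every hit")
```

Because nothing is trained, the ranking in `ranked` is arbitrary; the sketch only shows how data flows through the three stages. In a real system the embeddings would come from a trained skip-gram model and the convolution, LSTM, and output weights would be learned from tweets labeled with their hashtags.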
