Popularity prediction in microblog based on LR-DT

Microblogs are among the most influential social media platforms. Timely prediction of popular tweets in microblogs is of great value for emergency monitoring, public opinion analysis, personalized recommendation, marketing and other areas. This paper presents an improved method for predicting the popularity of tweets in microblogs. First, we propose several new dynamic features, such as retweet depth, retweet width and the total number of fans of the forwarders, to improve prediction performance. Second, we propose an efficient algorithm, LR-DT, which combines linear regression and a decision tree to detect popularity at an early stage. We first train a decision tree on the selected feature space. Then, for each new tweet, we use linear regression to predict the values that its dynamic features will take one hour into its propagation. Finally, we use the decision tree classifier to predict the popularity of the tweet from the predicted dynamic features together with some static features. Experiments are conducted on a real data set from Microblog, and the results show that the proposed method significantly reduces the time needed to identify popular tweets while maintaining accuracy.
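
The three-step LR-DT pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the regression form, the tree-induction algorithm or the sampling scheme, so the sketch assumes a straight-line fit over time for each dynamic feature, a small Gini-based tree, observations taken every 5 minutes, and a 60-minute horizon. All function names (`fit_line`, `extrapolate`, `build_tree`, `predict`, `classify_early`) and all numbers in the toy data are hypothetical and serve only to make the example runnable.

```python
from collections import Counter

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:                      # all x equal: fall back to a flat line
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def extrapolate(minutes, values, horizon=60):
    """Predict a dynamic feature at `horizon` minutes from early observations."""
    a, b = fit_line(minutes, values)
    return a + b * horizon

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3):
    """Tiny decision tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:                  # no usable split: return a leaf
        return majority
    _, j, t, left, right = best
    return (j, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def classify_early(tree, minutes, dynamic_series, static_features):
    """Extrapolate each dynamic feature to one hour, then classify."""
    predicted = [extrapolate(minutes, s) for s in dynamic_series]
    return predict(tree, predicted + list(static_features))

# Step 1: train the tree on one-hour feature values (toy, made-up data).
# Columns: retweet depth, retweet width, total fans of forwarders, author fans (static).
TRAIN_ROWS = [
    [1, 3, 500, 200], [2, 5, 1200, 150], [1, 2, 300, 5000],
    [5, 40, 90000, 300], [6, 55, 150000, 1000], [4, 35, 70000, 80],
]
TRAIN_LABELS = [0, 0, 0, 1, 1, 1]     # 1 = popular, 0 = not popular
tree = build_tree(TRAIN_ROWS, TRAIN_LABELS)

# Steps 2 and 3: a new tweet observed at 5, 10 and 15 minutes.
minutes = [5, 10, 15]
dynamic = [[1, 2, 3], [4, 8, 12], [3000, 6000, 9000]]   # depth, width, forwarder fans
print(classify_early(tree, minutes, dynamic, static_features=[250]))
```

The design point the sketch tries to capture is that the classifier is trained on feature values measured at the one-hour mark, while at prediction time those values are replaced by regression estimates built from the first few minutes, which is how the decision can be made earlier. In practice a library implementation of both models would likely replace the hand-written versions here.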