A Classification Method for Micro-Blog Popularity Prediction: Considering the Semantic Information

Predicting the scale and volume of reposting in micro-blog networks is significant for network marketing, hot topic detection, and public opinion monitoring. This study proposes a novel two-stage method to predict the popularity of a micro-blog prior to its release. By focusing on the text content of a specific micro-blog as well as its source of publication (the user’s attributes), a specialized classification method, Labeled Latent Dirichlet Allocation (LLDA), is trained to predict the volume range of future reposts for a new message. To the authors’ knowledge, this paper is the first to use this multi-label text classifier to investigate the influence of a micro-blog’s topic on its reposting scale. The experiment was conducted on a large-scale dataset, and the results show that it is possible to estimate popularity ranges with an overall accuracy of 72.56%.
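
To make the prediction target concrete, the following minimal sketch shows how raw repost counts could be turned into volume-range labels and how an overall accuracy figure of the kind reported above could be computed. The bin edges, label names, and function names (`BIN_EDGES`, `BIN_LABELS`, `popularity_label`, `overall_accuracy`) are illustrative assumptions, not the ranges or code used in this study, and the LLDA classifier itself is not shown.

```python
from bisect import bisect_right

# Hypothetical bin edges: the abstract does not state the actual ranges.
BIN_EDGES = [10, 100, 1000]
BIN_LABELS = ["0-9", "10-99", "100-999", "1000+"]


def popularity_label(repost_count: int) -> str:
    """Map a raw repost count to its (assumed) volume-range label."""
    if repost_count < 0:
        raise ValueError("repost_count must be non-negative")
    # bisect_right returns the index of the first edge greater than the count,
    # which is also the index of the range the count falls into.
    return BIN_LABELS[bisect_right(BIN_EDGES, repost_count)]


def overall_accuracy(true_labels, predicted_labels) -> float:
    """Fraction of messages whose predicted range equals the true range."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

Under these assumptions, a classifier trained on text and user attributes would output one of the labels in `BIN_LABELS` for each new message, and `overall_accuracy` would compare those outputs with the labels derived from the observed repost counts.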