Predicting popular messages in Twitter

Social network services have become a viable source of information for users. In Twitter, information deemed important by the community propagates through retweets. Studying the characteristics of such popular messages is important for a number of tasks, such as breaking news detection, personalized message recommendation, viral marketing and others. This paper investigates the problem of predicting the popularity of messages as measured by the number of future retweets and sheds some light on what kinds of factors influence information propagation in Twitter. We formulate the task into a classification problem and study two of its variants by investigating a wide spectrum of features based on the content of the messages, temporal information, metadata of messages and users, as well as structural properties of the users' social graph on a large scale dataset. We show that our method can successfully predict messages which will attract thousands of retweets with good performance.