Box office prediction based on microblog

As online social media grows in importance and popularity, more research aims to exploit the information it contains. One important topic is predicting future outcomes from social media. This paper focuses on predicting box office revenue using microblogs. Previous work relies simply on the count of related microblogs; this paper uses the information in social media more deeply. Two sets of features are extracted: count-based features and content-based features. For the former, user-level information is exploited to reduce the influence of spam microblogs. For the latter, a new box-office-oriented semantic classification method is proposed so that the features are more relevant to box office revenue. In addition, more complex machine learning models, such as SVMs and neural networks, are applied to the prediction task. The resulting prediction model is more accurate and reliable than one based on simple counts. The method is applied to data from Tencent microblog to predict the box office revenue of selected movies in China. The results demonstrate the strength of our method and the predictive power of online social media.
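As a rough illustration of how user-level information can temper a raw microblog count, the following minimal sketch computes three count-based features for one movie. It is not the paper's implementation: the function name `count_features`, the `(user_id, text)` input format, and the logarithmic damping rule are all assumptions made for this example.

```python
from collections import Counter
from math import log


def count_features(posts):
    """Compute count-based features for one movie.

    posts: list of (user_id, text) pairs (hypothetical input format).
    """
    per_user = Counter(user for user, _ in posts)
    total_posts = len(posts)
    distinct_users = len(per_user)
    # Assumed damping rule: a user's k posts contribute 1 + log(k),
    # so one account posting many times cannot dominate the feature.
    damped_count = sum(1.0 + log(k) for k in per_user.values())
    return {
        "total_posts": total_posts,
        "distinct_users": distinct_users,
        "damped_count": damped_count,
    }


# Toy data: two ordinary users and one account repeating the same post.
posts = [
    ("u1", "can't wait for this movie"),
    ("u2", "saw the trailer, looks great"),
    ("spam", "buy tickets here"),
    ("spam", "buy tickets here"),
    ("spam", "buy tickets here"),
    ("spam", "buy tickets here"),
]
features = count_features(posts)
print(features)
```

In this toy input the raw count is 6, but only 3 distinct users are involved, and the damped count falls between the two. Features of this kind could then be passed, together with content-based features, to a regression model such as an SVM or a neural network.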
