Improving Movie Gross Prediction through News Analysis

Traditional movie gross predictions are based on numerical and categorical movie data from The Internet Movie Database (IMDB). In this paper, we use quantitative news data generated by Lydia, our system for large-scale news analysis, to help predict movie grosses. Evaluating two kinds of models (regression and k-nearest neighbor), we find that models using only news data achieve performance similar to that of models using IMDB data. Moreover, combining IMDB data with news data yields better performance than either source alone, and the improvement is statistically significant.
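As a rough illustration of the k-nearest neighbor setting with combined features, the following minimal sketch predicts a gross as the mean gross of the k closest training movies. It is not the paper's implementation: the Euclidean distance, the averaging rule, the helper names `combine` and `knn_predict`, and all feature values are assumptions made for illustration only.

```python
import math


def combine(imdb_features, news_features):
    """Concatenate IMDB and news feature vectors (hypothetical helper)."""
    return list(imdb_features) + list(news_features)


def knn_predict(train_X, train_y, query, k=3):
    """Return the mean target of the k training points nearest to query.

    Assumes Euclidean distance and unweighted averaging; the paper's
    exact distance and aggregation choices are not given in the abstract.
    """
    # Pair every training movie's distance to the query with its gross.
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Toy, made-up values: one scaled IMDB feature and one scaled news feature
# per movie, with grosses in arbitrary units.
train_imdb = [[0.1], [0.2], [0.8], [0.9]]
train_news = [[0.2], [0.1], [0.9], [0.7]]
train_gross = [10.0, 12.0, 90.0, 110.0]

train_X = [combine(i, n) for i, n in zip(train_imdb, train_news)]
query = combine([0.85], [0.8])
prediction = knn_predict(train_X, train_gross, query, k=2)
print(prediction)
```

The same `knn_predict` function can be run on IMDB-only or news-only vectors by skipping `combine`, which mirrors the three feature settings compared in the abstract.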
