Abstract. This paper explores the effectiveness of social network analysis and sentiment analysis in predicting trends. Our research focuses on predicting the success of new movies during their first four weeks at the box office. Specifically, we try to predict prices on the Hollywood Stock Exchange (HSX), a prediction market on movie gross income, and to predict the ratio of gross income to production budget. When predicting movie stock values on HSX, we consider two approaches. The first is to predict the daily changes in prices. This means predicting a mix of how well we think the movie will perform and how we think other people expect it to perform. The second is to predict the final closing price of the stock, which equals how much the movie actually grosses at the box office after four weeks. In this approach, the daily prices serve as feedback, reflecting the crowd's constantly revised estimate of the movie's final performance. Finally, we try to classify movies into three groups depending on whether they gross less than, just over, or a lot more than their production cost. For our predictions we gather three types of metrics. (1) Web Metrics: movie-rating metrics from IMDb and Rotten Tomatoes, box office performance data from Box Office Mojo, and movie quotes from HSX. (2) SNA Metrics: web and blog betweenness, which represent the general buzz about a movie on the web and among bloggers. We hypothesize that they will be useful because they are unconscious signals of a movie's popularity. (3) Sentiment Metrics: to determine the general sentiment about the movies, we gather posts from IMDb forums and derive positivity and negativity measures from the discussion in the forums.
Our preliminary results, obtained with prediction methods such as multilinear and non-linear regression that combine our three types of independent variables, are encouraging: we have been able to predict the final box office return at least as well as the participants in the HSX prediction market.
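The three-way grouping by gross income relative to production budget can be illustrated with a minimal sketch. The abstract does not state the cut-off values, so the thresholds `BREAK_EVEN` and `BIG_HIT`, the function name `classify`, the label strings, and the sample figures below are all assumptions made for illustration, not the authors' method or data.

```python
# Hypothetical cut-offs: the abstract does not give the actual thresholds.
BREAK_EVEN = 1.0  # gross income equals production budget
BIG_HIT = 2.0     # assumed boundary for "a lot more than" the budget


def classify(gross: float, budget: float) -> str:
    """Assign a movie to one of three groups by its gross-to-budget ratio."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = gross / budget
    if ratio < BREAK_EVEN:
        return "below_budget"      # grossed less than production cost
    if ratio < BIG_HIT:
        return "just_over_budget"  # grossed at or modestly above cost
    return "far_over_budget"       # grossed a lot more than cost


# Made-up (gross, budget) pairs in millions of dollars, not data from the paper.
movies = {
    "Movie A": (40.0, 60.0),
    "Movie B": (75.0, 60.0),
    "Movie C": (200.0, 60.0),
}
labels = {title: classify(gross, budget) for title, (gross, budget) in movies.items()}
```

In a setup like this, the resulting labels would serve as the target variable, with the web, SNA, and sentiment metrics as predictors. Where the boundaries are placed is a modelling decision that the abstract leaves open.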