Extracting collective probabilistic forecasts from web games

Game sites on the World Wide Web draw people from around the world with specialized interests, skills, and knowledge. Data from the games often reflects the players' expertise and will to win. We extract probabilistic forecasts from data obtained from three online games: the Hollywood Stock Exchange (HSX), the Foresight Exchange (FX), and the Formula One Pick Six (F1P6) competition. We find that all three yield accurate forecasts of uncertain future events. In particular, prices of so-called "movie stocks" on HSX are good indicators of actual box office returns. Prices of HSX securities tied to the Oscar, Emmy, and Grammy awards correlate well with observed frequencies of winning. FX prices are reliable indicators of future developments in science and technology. Collective predictions from players in the F1P6 competition serve as good forecasts of actual race outcomes. In some cases, forecasts induced from game data are more reliable than expert opinions. We argue that web games naturally attract well-informed and well-motivated players, and thus offer a valuable and oft-overlooked source of high-quality data with significant predictive value.
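
The claim that prices "correlate well with observed frequencies of winning" is a statement about calibration. As a rough illustration only, and not the authors' procedure, the sketch below groups forecasts into price bins and compares the mean price in each bin with the fraction of events that actually occurred. It assumes prices have already been normalized to the range [0, 1] so they can be read as probabilities; the function names, the bin count, and the sample data are all hypothetical.

```python
from collections import defaultdict


def calibration_table(prices, outcomes, n_bins=5):
    """Compare mean price with observed win frequency in equal-width bins.

    prices   -- forecasts in [0, 1] (assumed normalized; hypothetical input)
    outcomes -- 1 if the event occurred, 0 otherwise
    Returns a list of (bin_index, count, mean_price, win_frequency) rows.
    """
    if len(prices) != len(outcomes):
        raise ValueError("prices and outcomes must have the same length")
    bins = defaultdict(list)
    for price, won in zip(prices, outcomes):
        if not 0.0 <= price <= 1.0:
            raise ValueError("prices must lie in [0, 1]")
        # A price of exactly 1.0 is placed in the last bin.
        index = min(int(price * n_bins), n_bins - 1)
        bins[index].append((price, won))
    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_price = sum(p for p, _ in pairs) / len(pairs)
        win_frequency = sum(w for _, w in pairs) / len(pairs)
        table.append((index, len(pairs), mean_price, win_frequency))
    return table


def brier_score(prices, outcomes):
    """Mean squared error between forecasts and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(prices, outcomes)) / len(prices)


if __name__ == "__main__":
    # Made-up data for illustration; not taken from HSX, FX, or F1P6.
    sample_prices = [0.10, 0.15, 0.50, 0.55, 0.90, 0.85]
    sample_outcomes = [0, 0, 1, 0, 1, 1]
    for row in calibration_table(sample_prices, sample_outcomes):
        print(row)
    print(brier_score(sample_prices, sample_outcomes))
```

For well-calibrated forecasts, the mean price and the win frequency in each row should be close. The Brier score is included as one common single-number summary of forecast accuracy; it is a choice made for this sketch, not a measure named in the abstract.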
