Auto-play: A Data Mining Approach to ODI Cricket Simulation and Prediction

Cricket is a popular sport played by 16 countries, is the second most watched sport in the world after soccer, and supports a multi-million-dollar industry. There is tremendous interest in simulating cricket and, more importantly, in predicting the outcome of games, particularly in the one-day international (ODI) format. The complex rules governing the game, along with the many natural parameters affecting the outcome of a match, present significant challenges for accurate prediction. Diverse parameters, including but not limited to cricketing skills and performances, match venues, and even weather conditions, can significantly affect the outcome of a game. The sheer number of parameters, along with their interdependence and variance, makes building an accurate quantitative model of a game a non-trivial task. Unlike sports such as basketball and baseball, which are well researched from a sports analytics perspective, cricket has yet to be investigated in depth for these tasks. In this paper, we build a prediction system that takes in historical match data as well as the instantaneous state of a match, and predicts future match events culminating in a victory or loss. We model the game with a subset of match parameters, using a combination of linear regression and nearest-neighbor clustering algorithms. We describe our model and algorithms, and finally present quantitative results demonstrating the performance of our algorithms in predicting the number of runs scored, one of the most important determinants of match outcome.
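As a rough illustration of how a linear regression estimate and a nearest-neighbor estimate of runs scored might be combined, consider the minimal sketch below. It is not the model described in the paper: the two features (overs remaining, wickets in hand), the equal blending weight, the function names, and the data are all hypothetical and synthetic, and a plain nearest-neighbor lookup stands in for the nearest-neighbor clustering mentioned above.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, x):
    """Evaluate the fitted linear model at a single match state x."""
    return float(np.append(x, 1.0) @ coef)

def predict_knn(X, y, x, k=2):
    """Mean target of the k historical states closest to x (Euclidean distance)."""
    dist = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(dist)[:k]
    return float(y[nearest].mean())

def predict_runs(X, y, x, k=2, weight=0.5):
    """Blend the two estimates; the weighting is an arbitrary assumption."""
    coef = fit_linear(X, y)
    return weight * predict_linear(coef, x) + (1 - weight) * predict_knn(X, y, x, k)

# Synthetic, made-up history: [overs_remaining, wickets_in_hand] -> runs
# scored in the rest of the innings. Not real match data.
X = np.array([[10, 8], [10, 4], [20, 8], [20, 4], [30, 8], [30, 4]], dtype=float)
y = np.array([80, 60, 140, 100, 200, 140], dtype=float)

state = np.array([20.0, 6.0])  # hypothetical current match state
estimate = predict_runs(X, y, state)
print(f"Estimated remaining runs: {estimate:.1f}")
```

The regression captures the overall trend across all historical states, while the nearest-neighbor term leans on the few situations most similar to the current one; how to weight the two is a modelling decision that this sketch does not attempt to settle.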
