A mixed effects model for identifying goal scoring ability of footballers

The paper presents a model for identifying the goal scoring ability of footballers. By decomposing the scoring process into the generation of shots and the conversion of shots to goals, abilities can be estimated from two mixed effects models. We compare several versions of our model, as tools for predicting the number of goals that a player will score in the following season, with a naive method that assumes a player's goals-per-minute ratio is constant from one season to the next. We find that our model outperforms the naive method and that this outperformance can be attributed, in part, to the model's separation of a player's ability from the chance that may have influenced his goal scoring statistics in the previous season.
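The contrast between the naive method and the two-stage decomposition can be illustrated with a minimal sketch. This is not the paper's fitted mixed effects model: the function names are hypothetical, and the `shrink` helper uses a simple pseudo-count adjustment, assumed here only as a stand-in for the way random effects pull a player's raw rate towards the population rate. The population rates and prior strengths in the usage note are illustrative values, not estimates from the paper.

```python
def naive_forecast(prev_goals, prev_minutes, next_minutes):
    """Naive baseline: carry last season's goals-per-minute ratio forward."""
    if prev_minutes <= 0:
        raise ValueError("prev_minutes must be positive")
    return prev_goals / prev_minutes * next_minutes


def shrink(observed_count, exposure, population_rate, prior_strength):
    """Pull a raw rate (count / exposure) towards a population rate.

    Assumption: a pseudo-count adjustment, used as a crude stand-in for
    the shrinkage that a random effect would provide. prior_strength is
    the amount of pseudo-exposure given to the population rate.
    """
    return (observed_count + prior_strength * population_rate) / (
        exposure + prior_strength
    )


def decomposed_forecast(shot_rate_per_minute, conversion_prob, next_minutes):
    """Expected goals = minutes x shots per minute x P(goal | shot)."""
    return next_minutes * shot_rate_per_minute * conversion_prob
```

For a hypothetical player with 10 goals from 50 shots in 1800 minutes, `naive_forecast(10, 1800, 2700)` projects about 15 goals over 2700 minutes. Shrinking the shot rate with `shrink(50, 1800, 0.02, 900)` and the conversion rate with `shrink(10, 50, 0.10, 50)` before calling `decomposed_forecast` gives about 10.2 goals, because part of the previous season's tally is treated as chance rather than ability.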
