Who’s the Favourite? – A Bivariate Poisson Model for the UEFA European Football Championship 2016

Many approaches to analyzing and predicting the results of soccer matches model the two teams' scores as a pair of independent Poisson distributions. In such models, dependence between the scores of the two competing teams enters only through the covariate information of both teams. One objective of this article is to examine whether this type of modeling is appropriate, or whether the dependence structure of the joint score of a soccer match needs to be modeled explicitly. To this end, a specific bivariate Poisson model for the numbers of goals scored by the two national teams in UEFA European football championship matches is fitted to all matches from the three previous European championships, including covariate information on both competing teams. A boosting approach is then used to select the relevant covariates. Based on the resulting estimates, the current tournament is simulated 1,000,000 times to obtain winning probabilities for all participating national teams.
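To make the idea of explicit dependence concrete, the following is a minimal sketch of one common bivariate Poisson construction, the trivariate reduction, in which both scores share a Poisson component so that their covariance equals the rate of that component. It is an illustration under stated assumptions, not the article's fitted model: the function names, the rates `lambda1`, `lambda2`, `lambda3`, and the values in the usage line are all hypothetical, and in the article the rates would depend on covariates selected by boosting.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    if rate <= 0.0:
        return 0
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_bivariate_poisson(lambda1, lambda2, lambda3, rng):
    """Trivariate reduction: X = W1 + W3, Y = W2 + W3 with independent
    W_i ~ Poisson(lambda_i), so that Cov(X, Y) = lambda3."""
    shared = sample_poisson(lambda3, rng)
    goals_a = sample_poisson(lambda1, rng) + shared
    goals_b = sample_poisson(lambda2, rng) + shared
    return goals_a, goals_b


def match_outcome_probabilities(lambda1, lambda2, lambda3,
                                n_sims=100_000, seed=2016):
    """Estimate P(team A wins), P(draw), P(team B wins) by simulation."""
    rng = random.Random(seed)
    wins_a = draws = wins_b = 0
    for _ in range(n_sims):
        goals_a, goals_b = sample_bivariate_poisson(
            lambda1, lambda2, lambda3, rng)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_a == goals_b:
            draws += 1
        else:
            wins_b += 1
    return wins_a / n_sims, draws / n_sims, wins_b / n_sims


# Hypothetical rates chosen only for illustration (not estimates from the article).
print(match_outcome_probabilities(1.6, 0.9, 0.2))
```

Setting `lambda3 = 0` recovers the independent pairwise Poisson case, which is the comparison the abstract describes. A full tournament simulation would repeat such match draws along the tournament bracket.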
