Bayesian analysis of Formula One race results: disentangling driver skill and constructor advantage

Successful performance in Formula One is determined by a combination of the driver's skill and the race-car constructor's advantage. This makes key performance questions in the sport difficult to answer. For example, who is the best Formula One driver, which is the best constructor, and what are their relative contributions to success? In this paper, we answer these questions using data from the hybrid era of Formula One (the 2014 to 2021 seasons). We present a novel Bayesian multilevel rank-ordered logit regression method to model individual race finishing positions. We show that this modelling approach describes the data well, which allows for precise inferences about driver skill and constructor advantage. We conclude that Hamilton and Verstappen are the best drivers of the hybrid era, that the top three teams (Mercedes, Ferrari, and Red Bull) clearly outperform the other constructors, and that approximately 88% of the variance in race results is explained by the constructor. We argue that this modelling approach may prove useful for sports beyond Formula One, as it produces performance ratings for the independent components that contribute to success.
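
To make the rank-ordered logit idea concrete, the following is a minimal sketch of the log-likelihood of a single race's finishing order. It is not the paper's implementation: the additive combination of driver skill and constructor advantage, the function name rank_ordered_logit_loglik, and the toy drivers, teams, and ability values are all assumptions chosen for illustration. The sketch treats each finishing position as a softmax "contest" among the drivers who have not yet been placed.

```python
import math


def rank_ordered_logit_loglik(finish_order, driver_skill, constructor_adv, team_of):
    """Log-likelihood of one finishing order under a rank-ordered logit model.

    finish_order    -- list of driver names, winner first
    driver_skill    -- dict: driver name -> latent skill (hypothetical values)
    constructor_adv -- dict: team name -> latent car advantage (hypothetical values)
    team_of         -- dict: driver name -> team name
    """
    # Assumption: latent ability is the sum of driver skill and constructor advantage.
    ability = [driver_skill[d] + constructor_adv[team_of[d]] for d in finish_order]

    loglik = 0.0
    # For each position except the last, the driver who takes that position
    # "wins" a softmax choice among all drivers not yet placed.
    for k in range(len(ability) - 1):
        remaining = ability[k:]
        m = max(remaining)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(a - m) for a in remaining))
        loglik += ability[k] - log_denom
    return loglik


# Toy example with made-up drivers, teams, and ability values.
team_of = {"A": "T1", "B": "T1", "C": "T2"}
driver_skill = {"A": 0.5, "B": 0.0, "C": 0.3}
constructor_adv = {"T1": 1.0, "T2": 0.0}

likely = rank_ordered_logit_loglik(["A", "B", "C"], driver_skill, constructor_adv, team_of)
unlikely = rank_ordered_logit_loglik(["C", "B", "A"], driver_skill, constructor_adv, team_of)
print(likely, unlikely)
```

In a Bayesian multilevel version of this sketch, the skill and advantage terms would receive hierarchical priors and the per-race log-likelihoods would be summed over all races; comparing the estimated variances of the two sets of terms is one way a constructor share of variance could be computed.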
