Bisecting for Selecting: Using a Laplacian Eigenmaps Clustering Approach to Create the New European Football Super League

We use European football performance data to select teams for the proposed European football Super League, using only unsupervised techniques. We first use random forest regression to select the variables most important for predicting goal difference, and then use those variables to calculate the Euclidean distances between teams. After constructing a Laplacian eigenmap, we bisect the Fiedler vector to identify natural clusters within the five major European football leagues. Our results show that an unsupervised approach can identify four clusters based on five basic performance metrics: shots, shots on target, shots conceded, possession, and pass success. The top two clusters contain the teams that dominate their respective leagues and are the best candidates to create the most competitive elite super league.
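
As a rough illustration of the bisection step, the sketch below builds a graph Laplacian from pairwise Euclidean distances and splits the observations by the sign of the Fiedler vector (the eigenvector belonging to the second-smallest eigenvalue). It is a minimal sketch under stated assumptions, not the paper's implementation: the heat-kernel affinity, the `sigma` bandwidth, the unnormalised Laplacian, the function name `fiedler_bisect`, and the toy data are all hypothetical choices made for the example.

```python
import numpy as np


def fiedler_bisect(X, sigma=1.0):
    """Split the rows of X into two groups by the sign of the Fiedler vector.

    The heat-kernel affinity and the sigma bandwidth are illustrative
    assumptions; the abstract only states that Euclidean distances are used.
    """
    # Pairwise squared Euclidean distances between rows (one row per team).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Turn distances into similarities; no self-loops on the graph.
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalised graph Laplacian L = D - W, with D the degree matrix.
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues of a symmetric matrix in ascending order,
    # so column 1 holds the eigenvector of the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]

    # Bisect at zero. Which side is "True" is arbitrary, because an
    # eigenvector is only defined up to sign.
    return fiedler >= 0, fiedler


# Made-up, already standardised metrics for six hypothetical teams:
# three "strong" rows followed by three "weak" rows.
X = np.array([
    [3.0, 3.1],
    [3.2, 2.9],
    [2.9, 3.0],
    [0.0, 0.1],
    [0.2, -0.1],
    [-0.1, 0.0],
])

labels, fiedler = fiedler_bisect(X)
print(labels)
```

On this toy input the first three rows land on one side of the split and the last three on the other. Applying the same split again within each half would be one way to arrive at four groups, although the abstract does not spell out how the four clusters were obtained.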
