Effects of Pacing Properties on Performance in Long-Distance Running

This article examines the performance of runners in official races. Using extensive public data from participants in races organized by the Boston Athletic Association, we demonstrate how different pacing profiles affect race performance. An athlete's pacing profile is the running speed at successive stages of the race. We aim to provide practical, data-driven advice for professional as well as recreational runners. Our data collection covers three years of results made public by the race organizers and primarily concerns the times at various intermediate points, which indicate the speed profile of each individual runner. We consider the 10 km, half marathon, and full marathon, yielding a data set of 120,472 race results. Although these data were not primarily recorded for scientific analysis, we demonstrate that this substantial data set yields valuable information about how best to approach a running challenge. In this article, we focus on the role of race distance, gender, age, and pacing profile. Since age is a crucial but complex determinant of performance, we first model the age effect in a gender- and distance-specific manner. We consider polynomials of high degree and use cross-validation to select models that are both accurate and sufficiently generalizable. We then cluster the race profiles to identify the dominant pacing profiles that runners adopt. Finally, having compensated for the influence of age, we apply a descriptive pattern mining approach to select reliable and informative aspects of pacing that most strongly determine optimal performance. The mining paradigm produces relatively simple and readable patterns, so that both professionals and amateurs can use the results to their benefit.
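
As an illustration of the age-modelling step, the following is a minimal sketch of selecting a polynomial degree by k-fold cross-validation. It is not the procedure used in this article: the function names `cv_rmse` and `select_degree`, the choice of RMSE as the error measure, the number of folds, the maximum degree, and the synthetic age and finish-time data are all assumptions made purely for demonstration.

```python
import numpy as np


def cv_rmse(x, y, degree, k=5, seed=0):
    """Mean k-fold cross-validated RMSE of a polynomial fit of the given degree."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Polynomial.fit rescales x internally, which keeps high degrees stable.
        poly = np.polynomial.Polynomial.fit(x[train], y[train], degree)
        residuals = y[test] - poly(x[test])
        errors.append(np.sqrt(np.mean(residuals ** 2)))
    return float(np.mean(errors))


def select_degree(x, y, max_degree=8, k=5, seed=0):
    """Return the degree with the lowest cross-validated RMSE, plus all scores."""
    scores = {d: cv_rmse(x, y, d, k, seed) for d in range(1, max_degree + 1)}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic, made-up data (NOT the race data): finish time in minutes
# rising quadratically away from age 30, plus Gaussian noise.
rng = np.random.default_rng(42)
age = rng.uniform(18, 75, size=400)
finish_min = 240 + 0.05 * (age - 30) ** 2 + rng.normal(0, 10, size=400)

best_degree, scores = select_degree(age, finish_min)
print("selected degree:", best_degree)
```

In a gender- and distance-specific setting, one would presumably run such a selection separately for each subgroup, so that each combination of gender and race distance receives its own age curve.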