Prediction of arrival times of freight traffic on US railroads using support vector regression

Abstract: Travel times on the United States freight rail network are highly variable because network demand is large relative to infrastructure capacity, especially when traffic is heterogeneous. Variable runtimes pose significant operational challenges when the variability cannot be predicted. To address this issue, this article proposes a data-driven approach to predicting the estimated time of arrival (ETA) of individual freight trains based on the properties of the train, the network, and potentially conflicting traffic on the network. The ETA prediction problem from an origin to a destination is posed as a machine learning regression problem and solved using support vector regression, trained and cross-validated on over two years of detailed historical data for a 140-mile section of track located primarily in Tennessee, USA. The article presents the data used in this problem and details the feature engineering and construction for predictions made across the full route. It also highlights findings on the dominant sources of runtime variability and the most predictive factors for ETA. ETA predictions improve by more than 21% over a baseline prediction method at some locations and by an average of 14% across the study area.
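
The regression framing described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is used for support vector regression. Everything specific here is hypothetical and not taken from the article: the three features (`train_length_ft`, `tonnage`, `n_conflicts`), the synthetic runtime data, the hyperparameters, the median-runtime baseline, and the choice of mean absolute error as the metric. It only shows the general shape of a cross-validated SVR compared against a simple baseline.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n = 400  # number of synthetic train runs

# Hypothetical features: train properties and conflicting traffic.
train_length_ft = rng.uniform(2000, 10000, n)
tonnage = rng.uniform(2000, 15000, n)
n_conflicts = rng.integers(0, 6, n)  # potentially conflicting trains en route
X = np.column_stack([train_length_ft, tonnage, n_conflicts])

# Synthetic origin-to-destination runtime in minutes (made up for illustration).
y = (240 + 0.004 * train_length_ft + 0.003 * tonnage
     + 15 * n_conflicts + rng.normal(0, 10, n))

# SVR is sensitive to feature scale, so standardize inside the pipeline
# to keep the scaling fitted on training folds only.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=1.0))
baseline = DummyRegressor(strategy="median")  # assumed baseline: median runtime

cv = KFold(n_splits=5, shuffle=True, random_state=0)
y_pred = cross_val_predict(svr, X, y, cv=cv)
y_base = cross_val_predict(baseline, X, y, cv=cv)

# Mean absolute error of out-of-fold predictions, and relative improvement.
mae_svr = np.mean(np.abs(y - y_pred))
mae_base = np.mean(np.abs(y - y_base))
improvement_pct = 100.0 * (mae_base - mae_svr) / mae_base

print(f"baseline MAE: {mae_base:.1f} min, SVR MAE: {mae_svr:.1f} min")
print(f"improvement over baseline: {improvement_pct:.1f}%")
```

The improvement figure printed by this sketch reflects only the synthetic data and is not comparable to the 21% and 14% results reported in the abstract.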
