Enhanced data reconciliation of freight rail dispatch data

Abstract: Widespread use of data-driven analysis for rail operations problems requires large volumes of complete and clean data. This work proposes a data reconciliation problem for rail dispatch data that automatically cleans and completes noisy and incomplete records. The proposed method finds a minimally perturbed modification of the observed historical data that satisfies operational constraints, such as the feasibility of meet and overtake events. The method is demonstrated on a large historical dataset from freight rail territory in Tennessee, US, containing over 3000 train records collected over six months. The results show that data reconciliation reduces the timing error of imputed points by up to 15% and increases the share of meet and overtake events estimated at the correct historical location from less than 40% to approximately 95%. Regularizing the data reconciliation problem with historical train performance data further decreases the error of reconstructed points by 15%, and using an L2 normalization can reduce mean squared error by over 50%. These findings indicate that data reconciliation is a useful preprocessing step for analysis and modeling of railroad operations based on real-world physical dispatching data.
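
The idea of a minimally perturbed modification that satisfies operational constraints can be illustrated with a much simpler constraint than the ones used in the paper. The sketch below assumes a single train whose timestamps at successive track points must be nondecreasing, and finds the closest feasible sequence in the least-squares sense using the pool-adjacent-violators procedure. The function name `reconcile_monotone` and the monotonicity constraint are illustrative assumptions; this is not the paper's formulation, which handles meet and overtake feasibility across multiple trains.

```python
def reconcile_monotone(times):
    """Return the nondecreasing sequence closest to `times` in squared error.

    Illustrative stand-in for data reconciliation: the only constraint is that
    a train cannot reach a later track point before an earlier one.
    Uses the pool-adjacent-violators algorithm (isotonic regression).
    """
    blocks = []  # each block holds [sum of values, number of values]
    for t in times:
        blocks.append([float(t), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    reconciled = []
    for total, count in blocks:
        # Every point in a merged block takes the block mean.
        reconciled.extend([total / count] * count)
    return reconciled


# Hypothetical observed timestamps (minutes) at four consecutive track points.
# The third reading precedes the second, which is physically infeasible.
observed = [0.0, 12.0, 10.0, 25.0]
print(reconcile_monotone(observed))
```

Only the two conflicting readings are adjusted, and both are moved to their average; the feasible readings are left untouched. A richer constraint set, such as the meet and overtake feasibility described in the abstract, would generally call for a general-purpose optimization solver rather than this closed-form procedure.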
