A Kriging based spatiotemporal approach for traffic volume data imputation

Along with the rapid development of Intelligent Transportation Systems, traffic data collection technologies have advanced quickly. Innovative collection technologies such as remote traffic microwave sensors, Bluetooth sensors, GPS-based floating cars, and automated license plate recognition have significantly increased the variety and volume of traffic data. Despite these advances, missing data remains a problem that poses a great challenge for data-driven applications such as traffic forecasting, real-time incident detection, dynamic route guidance, and massive evacuation optimization. A thorough literature review suggests that most current imputation models either focus on the temporal nature of traffic data and fail to consider spatial information from neighboring locations, or assume that the data follow a certain distribution. The first issue reduces imputation accuracy, and the second limits the applicability of the corresponding methods. This paper therefore presents a Kriging-based data imputation approach that fully utilizes the spatiotemporal correlation in traffic data and does not assume that the data follow any particular distribution. A set of scenarios with different missing rates is used to evaluate the performance of the proposed method, which is compared with that of two widely used methods, historical average and K-nearest neighbor. The comparison results indicate that the proposed method has the highest imputation accuracy and is more flexible than the other methods.
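To make the idea of Kriging-based imputation concrete, the following is a minimal sketch of ordinary Kriging over space-time coordinates. It is not the paper's implementation: the exponential variogram, its parameters (`nugget`, `sill`, `range_`), the Euclidean space-time distance, and the toy detector data are all assumptions chosen for illustration. In practice the variogram would be fitted to the observed volumes, and the time axis would need to be scaled relative to distance.

```python
import numpy as np


def exp_variogram(h, nugget=0.0, sill=1.0, range_=3.0):
    """Exponential variogram model (assumed here, not taken from the paper)."""
    return nugget + (sill - nugget) * (1.0 - np.exp(-h / range_))


def ordinary_kriging(coords, values, target, variogram=exp_variogram):
    """Estimate the value at `target` from observed (coords, values).

    coords : (n, 2) array of (location, time) pairs, already on comparable scales
    values : (n,) array of observed traffic volumes
    target : (2,) array, the (location, time) of the missing observation
    Returns the estimate and the Kriging weights.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Pairwise distances between observed points.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    gamma = variogram(d)
    np.fill_diagonal(gamma, 0.0)  # the variogram is zero at zero lag

    # Ordinary Kriging system: variogram matrix bordered by the
    # unbiasedness constraint (weights sum to one).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma
    A[n, n] = 0.0

    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))

    solution = np.linalg.solve(A, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return float(weights @ values), weights


# Toy data (hypothetical): detectors at 0, 1, 2 km over two time intervals,
# with the reading at (1 km, interval 1) missing.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [2.0, 1.0]])
values = np.array([100.0, 110.0, 120.0, 105.0, 125.0])
target = np.array([1.0, 1.0])

estimate, weights = ordinary_kriging(coords, values, target)
print("imputed volume:", estimate)
print("weights:", weights)
```

The weights depend only on the variogram and the geometry of the observed points, not on any assumed distribution of the volumes, and both spatial and temporal neighbors contribute to the estimate.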
