Estimation of Missing Data in Intelligent Transportation System

Missing data is a challenge in many applications, including intelligent transportation systems (ITS). In this paper, we study traffic speed and travel time estimation in ITS, where portions of the collected data are missing due to sensor instability and communication errors at collection points. These practical issues can be remediated by missing data analysis methods, which are mainly categorized as either statistical or machine learning (ML)-based approaches. Statistical methods require the a priori probability distribution of the data, which is unknown in our application. Therefore, we focus on an ML-based approach, the Multi-Directional Recurrent Neural Network (M-RNN), which utilizes both temporal and spatial characteristics of the data. We evaluate the effectiveness of this approach on a TomTom dataset containing spatio-temporal measurements of average vehicle speed and travel time in the Greater Toronto Area (GTA). We evaluate the method under various conditions, and the results demonstrate that M-RNN outperforms existing solutions, e.g., spline interpolation and matrix completion, reducing the Root Mean Square Error (RMSE) by up to 58%.
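
As a rough illustration of how an imputation baseline can be scored with RMSE, the following minimal sketch fills gaps in a short speed series and computes the error only over the missing positions. It is not the evaluation code of this paper: the function names (rmse_on_missing, linear_fill, rmse_reduction_percent) are hypothetical, the speed values are synthetic rather than drawn from the TomTom dataset, and plain linear interpolation stands in for the spline baseline to keep the example dependency-free.

```python
import math

def rmse_on_missing(truth, estimate, missing):
    """RMSE computed only over the positions flagged as missing."""
    sq_errs = [(t - e) ** 2 for t, e, m in zip(truth, estimate, missing) if m]
    if not sq_errs:
        raise ValueError("no missing positions to score")
    return math.sqrt(sum(sq_errs) / len(sq_errs))

def linear_fill(values, missing):
    """Fill missing positions by linear interpolation between the nearest
    observed neighbours; gaps at the edges copy the nearest observed value.
    (Stand-in for a spline baseline; an assumption of this sketch.)"""
    observed = [i for i, m in enumerate(missing) if not m]
    if not observed:
        raise ValueError("at least one observed value is required")
    filled = list(values)
    for i, m in enumerate(missing):
        if not m:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled

def rmse_reduction_percent(baseline_rmse, model_rmse):
    """Relative RMSE decrease of a model with respect to a baseline, in percent."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse

# Synthetic average speeds (km/h) for one road segment; two samples are hidden.
truth = [60.0, 58.0, 50.0, 45.0, 55.0, 62.0]
missing = [False, False, True, True, False, False]
observed = [0.0 if m else t for t, m in zip(truth, missing)]  # placeholder at gaps

filled = linear_fill(observed, missing)
baseline_rmse = rmse_on_missing(truth, filled, missing)
print(filled, baseline_rmse)
```

Scoring only the masked positions keeps the metric from being diluted by entries that were observed and therefore trivially correct. The percentage helper expresses an improvement relative to a baseline, which is how a statement such as "up to 58% decrease in RMSE" would typically be read.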
