Low-dimensional models for missing data imputation in road networks

Intelligent transport systems (ITS) require data with high spatial and temporal resolution for applications such as modeling, traffic management, prediction, and route guidance. However, field data are usually sparse, and this missing-data problem severely limits the effectiveness of ITS. Missing values are usually imputed using either historical data for the road or current information from neighboring links. In most scenarios, however, information from some or all of the neighboring links may not be available, and the historical data may also be incomplete. To overcome these issues, we propose methods that construct low-dimensional representations of large and diverse networks in the presence of missing historical and neighboring data. We use these low-dimensional models to reconstruct data profiles for road segments and impute missing values. To this end, we use Fixed Point Continuation with Approximate SVD (FPCA) and Canonical Polyadic (CP) decomposition for incomplete tensors. We apply these methods to expressways and a large urban road network to assess their performance in different scenarios.
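
The idea of imputing missing values from a low-dimensional model can be illustrated with a minimal sketch. The code below is not the FPCA algorithm itself: it is a simplified singular value soft-thresholding iteration in the same spirit, using a full SVD in place of FPCA's approximate SVD and a fixed threshold in place of continuation. The function name `impute_low_rank`, the parameters `mu` and `n_iter`, and the synthetic rank-2 matrix of road segments by time intervals are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def impute_low_rank(data, mask, mu=0.1, n_iter=500):
    """Fill entries where mask is False using iterative singular value shrinkage.

    Hypothetical helper: a simplified stand-in for low-rank matrix completion,
    not the FPCA method named in the abstract.
    """
    estimate = np.zeros_like(data, dtype=float)
    for _ in range(n_iter):
        # Keep measurements where observed, current estimate elsewhere.
        filled = np.where(mask, data, estimate)
        # Shrink singular values toward zero to favor a low-rank estimate.
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - mu, 0.0)
        estimate = (u * s) @ vt
    # Observed entries are returned unchanged; only missing ones are imputed.
    return np.where(mask, data, estimate)


rng = np.random.default_rng(0)
# Synthetic rank-2 matrix: 30 road segments x 48 time intervals (assumed sizes).
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 48))
# Roughly 30% of the entries are treated as missing.
mask = rng.random(truth.shape) > 0.3

completed = impute_low_rank(truth, mask)
rel_error = (
    np.linalg.norm((completed - truth)[~mask]) / np.linalg.norm(truth[~mask])
)
print(f"relative error on missing entries: {rel_error:.4f}")
```

Real traffic data are only approximately low rank, so the threshold `mu` would need tuning against held-out observations; the tensor (CP) variant would factorize a multi-way array instead of a matrix.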
