Missing traffic flow data prediction using least squares support vector machines in urban arterial streets

Accurate traffic parameters, such as traffic flow, travel speed, and occupancy, are crucial to the effective management of intelligent transportation systems (ITS). Some traffic data from loop detectors installed on arterial streets are incomplete, so the missing values need to be imputed effectively. In this paper, least squares support vector machines (LS-SVMs) are introduced to predict missing traffic flow on urban arterial streets based on spatio-temporal analysis. To the best of our knowledge, this is the first application of this emerging computational intelligence (CI) technique, combined with a state space approach, to missing traffic data imputation. Its good generalization ability and guaranteed global minimum support its performance in this area. A baseline imputation technique, expectation maximization/data augmentation (EM/DA), is selected for comparison because of its proven effectiveness in missing data recovery. In these comparisons, the proposed model proves more applicable and shows better stability and adaptability, which suggests that it is a promising approach to missing data prediction.
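
To illustrate the kind of model the abstract refers to, the following is a minimal sketch of LS-SVM regression used to fill in missing flow values. It is not the authors' implementation: the RBF kernel, the hyperparameters `gamma` and `sigma`, the synthetic detector data, the choice of features (a neighboring detector's current and previous flow), and all function names are assumptions made for illustration, and the state space component mentioned in the abstract is not reproduced. The sketch relies on the standard LS-SVM property that training reduces to solving one linear system, which is why the solution is a global optimum.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic data (assumption): flow at a neighboring detector and at the target
# detector, sampled at 200 consecutive intervals.
rng = np.random.default_rng(0)
t = np.arange(200)
neighbor = 300.0 + 100.0 * np.sin(2.0 * np.pi * t / 96.0) + rng.normal(0.0, 10.0, t.size)
target = 20.0 + 0.8 * np.roll(neighbor, 1) + rng.normal(0.0, 5.0, t.size)

# Spatio-temporal features: the neighbor's current and previous flow.
X = np.column_stack([neighbor[1:], neighbor[:-1]])
y = target[1:]

# Mark roughly 20% of the target flows as missing.
missing = rng.random(y.size) < 0.2
X_obs, y_obs = X[~missing], y[~missing]

# Standardize using statistics of the observed samples only.
x_mean, x_std = X_obs.mean(axis=0), X_obs.std(axis=0)
y_mean, y_std = y_obs.mean(), y_obs.std()
Xs_obs = (X_obs - x_mean) / x_std
Xs_mis = (X[missing] - x_mean) / x_std
ys_obs = (y_obs - y_mean) / y_std

gamma, sigma = 100.0, 1.0  # illustrative values, not tuned
b, alpha = lssvm_fit(Xs_obs, ys_obs, gamma, sigma)

# Impute the missing flows and map them back to the original scale.
y_hat = lssvm_predict(Xs_obs, b, alpha, Xs_mis, sigma) * y_std + y_mean
print("RMSE on held-out flows:", np.sqrt(np.mean((y_hat - y[missing]) ** 2)))
```

In practice `gamma` and `sigma` would be chosen by cross-validation on the observed data rather than fixed as they are here.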
