Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Forecasting Network-wide Traffic State with Missing Values

Short-term traffic forecasting based on deep learning methods, especially recurrent neural networks (RNNs), has received much attention in recent years. However, the potential of RNN-based models in traffic forecasting has not yet been fully exploited, both in their predictive power on spatiotemporal data and in their capability to handle missing data. In this paper, we focus on RNN-based models and attempt to reformulate how RNNs and their variants are incorporated into traffic prediction models. A stacked bidirectional and unidirectional LSTM network architecture (SBU-LSTM) is proposed to guide the design of neural network structures for traffic state forecasting. As a key component of the architecture, the bidirectional LSTM (BDLSTM) is used to capture the forward and backward temporal dependencies in spatiotemporal data. To deal with missing values in spatiotemporal data, we also propose a data imputation mechanism within the LSTM structure (LSTM-I), in which an imputation unit infers missing values and assists traffic prediction. The bidirectional version of LSTM-I is incorporated into the SBU-LSTM architecture. Two real-world network-wide traffic state datasets are used in the experiments and are published to facilitate further traffic prediction research. The prediction performance of multiple types of multi-layer LSTM and BDLSTM models is evaluated. Experimental results indicate that the proposed SBU-LSTM architecture, especially the two-layer BDLSTM network, can achieve superior performance for network-wide traffic prediction in both accuracy and robustness. Further, comprehensive comparison results show that the proposed data imputation mechanism in the RNN-based models can achieve outstanding prediction performance when the model's input data contains different patterns of missing values.
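To make the architecture described in the abstract more concrete, the following is a minimal NumPy sketch of a forward pass through a bidirectional LSTM layer stacked under a unidirectional LSTM layer, with a simple masked imputation step inside each cell. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the form of the imputation unit, so a linear map from the previous cell state (`W_imp`, `b_imp`) stands in for it here, the forward and backward outputs are merged by concatenation, and all names (`LSTMCell`, `sbu_lstm_forward`, `W_out`) are hypothetical. The weights are random and untrained, so the output has no predictive meaning.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class LSTMCell:
    """Plain LSTM cell with an assumed (hypothetical) imputation unit."""

    def __init__(self, n_in, n_hidden, rng):
        scale = 0.1
        # Stacked weights for the input, forget, output and candidate gates.
        self.W = rng.normal(0.0, scale, (4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)
        # Assumption: the imputation unit is a linear map from the previous
        # cell state to an estimate of the current input.
        self.W_imp = rng.normal(0.0, scale, (n_in, n_hidden))
        self.b_imp = np.zeros(n_in)
        self.n_hidden = n_hidden

    def step(self, x, mask, h, c):
        # Estimate the input from the previous cell state, then keep the
        # observed entries and fill only the missing ones.
        x_hat = self.W_imp @ c + self.b_imp
        x_filled = np.where(mask, x, x_hat)
        z = self.W @ np.concatenate([x_filled, h]) + self.b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

    def run(self, xs, masks, reverse=False):
        """Run over a (T, n_in) sequence; return hidden states of shape (T, H)."""
        T = xs.shape[0]
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        out = np.zeros((T, self.n_hidden))
        order = range(T - 1, -1, -1) if reverse else range(T)
        for t in order:
            h, c = self.step(xs[t], masks[t], h, c)
            out[t] = h
        return out


def sbu_lstm_forward(xs, masks, fwd, bwd, top, W_out, b_out):
    # Bidirectional layer: one pass in each time direction.
    h_fwd = fwd.run(xs, masks)
    h_bwd = bwd.run(xs, masks, reverse=True)
    # Assumption: the two directions are merged by concatenation.
    h_bi = np.concatenate([h_fwd, h_bwd], axis=1)
    # Unidirectional top layer; its inputs are fully observed.
    h_top = top.run(h_bi, np.ones_like(h_bi, dtype=bool))
    # Linear read-out from the last time step: one value per sensor.
    return W_out @ h_top[-1] + b_out


rng = np.random.default_rng(0)
T, n_sensors, H = 10, 4, 8
xs = rng.uniform(0.0, 1.0, (T, n_sensors))    # synthetic speeds, scaled to [0, 1]
masks = rng.uniform(size=(T, n_sensors)) > 0.2  # True where a value is observed
xs[~masks] = np.nan                            # mark missing readings

fwd = LSTMCell(n_sensors, H, rng)
bwd = LSTMCell(n_sensors, H, rng)
top = LSTMCell(2 * H, H, rng)
W_out = rng.normal(0.0, 0.1, (n_sensors, H))
b_out = np.zeros(n_sensors)

pred = sbu_lstm_forward(xs, masks, fwd, bwd, top, W_out, b_out)
print(pred.shape)
```

Because missing entries are replaced inside the cell before the gates are computed, whatever placeholder marks a missing reading (`NaN` here) never reaches the recurrent computation. In a trained model the imputation weights would be learned jointly with the forecasting objective.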
