Urban Link Travel Time Prediction Based on a Gradient Boosting Method Considering Spatiotemporal Correlations

Predicting travel times is challenging because real-time traffic data are sparse and travel on congested urban road networks is intrinsically uncertain. We propose a new gradient-boosted regression tree method for predicting link travel times. The model accounts for spatiotemporal correlations extracted from historical and real-time traffic data for the target link and its adjacent links. The method achieves high prediction accuracy by combining many simple regression trees, each of which performs poorly on its own: every new tree is fitted to correct the errors of the trees already in the model. We verified the spatiotemporal gradient-boosted regression tree model in experiments. The training data come from a large data set of historical traffic conditions collected by probe vehicles in Wuhan from January to May 2014. Real-time data were extracted from 11 weeks of GPS records collected in Wuhan from 5 May 2014 to 20 July 2014. Based on these data, we predicted link travel times for the period from 21 July 2014 to 25 July 2014. In these experiments, the proposed model obtained better results than the gradient boosting, random forest, and autoregressive integrated moving average approaches, which indicates its advantages for urban link travel time prediction.
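The boosting idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes squared-error loss, uses one-split trees (stumps) as the weak learners, and runs on synthetic data. The three features (the target link's last observed travel time, an adjacent link's last observed travel time, and a historical mean for the time slot) and all function names are hypothetical stand-ins for the spatiotemporal inputs described above.

```python
import random

def fit_stump(X, residuals):
    """Fit a one-split regression tree (stump) to the residuals by least squares."""
    n = len(X)
    total = sum(residuals)
    mean = total / n
    best = (0.0, 0, float("inf"), mean, mean)  # fallback: no split, predict the mean
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot place a threshold between equal values
            left_mean = left_sum / k
            right_mean = (total - left_sum) / (n - k)
            # Minimizing squared error is equivalent to maximizing this score.
            score = k * left_mean ** 2 + (n - k) * right_mean ** 2
            if score > best[0]:
                best = (score, j, (lo + hi) / 2, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, x):
    j, threshold, left_value, right_value = stump
    return left_value if x[j] <= threshold else right_value

def fit_gbrt(X, y, n_trees=100, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump is fit to current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, x) for p, x in zip(pred, X)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def mse(model, X, y):
    return sum((predict(model, x) - yi) ** 2 for x, yi in zip(X, y)) / len(y)

# Synthetic data (assumption): travel times in seconds, invented for illustration.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    prev_target = rng.uniform(60, 180)    # target link, previous interval
    prev_adjacent = rng.uniform(60, 180)  # adjacent link, previous interval
    hist_mean = rng.uniform(80, 140)      # historical mean for this time slot
    X.append([prev_target, prev_adjacent, hist_mean])
    y.append(0.6 * prev_target + 0.3 * prev_adjacent + rng.gauss(0, 5))

mean_only = fit_gbrt(X, y, n_trees=0)                        # baseline: constant
single_stump = fit_gbrt(X, y, n_trees=1, learning_rate=1.0)  # one weak learner
boosted = fit_gbrt(X, y, n_trees=100, learning_rate=0.1)     # boosted ensemble

print("training MSE, mean only:   ", mse(mean_only, X, y))
print("training MSE, single stump:", mse(single_stump, X, y))
print("training MSE, boosted:     ", mse(boosted, X, y))
```

The comparison is on training error only, so it shows how an ensemble of weak trees improves on any single one, not how well the model generalizes. A real evaluation would hold out a later period, as the experiments described above do.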
