Predicting Freeway Incident Duration Using Machine Learning

Traffic incident duration is valuable information for traffic management officials and road users alike. Conventional mathematical models may not capture the complex interactions among the many variables that affect incident duration. This paper summarizes the application of five state-of-the-art machine learning (ML) models to predicting traffic incident duration. More than 110,000 incident records with over 52 variables were retrieved from the Houston TranStar data archive. The ML techniques applied were regression decision tree, support vector machine (SVM), ensemble tree (bagged and boosted), Gaussian process regression (GPR), and artificial neural network (ANN). These methods are known to handle large and complex datasets effectively. To achieve the best modeling accuracy, the parameters of each model were fine-tuned. The results showed that the SVM and GPR models outperformed the other techniques in terms of mean absolute error (MAE), with the best model scoring an MAE of 14.34 min. The simple regression tree was the worst model overall, with an MAE of 16.74 min. Training time differed considerably between two groups of models: the regression decision tree, ensemble tree, and ANN on one hand, and the SVM and GPR on the other. The former group required less than one hour of training per model, whereas the latter required between 5 and 34 hours per model.
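The models above are compared by MAE, the average absolute gap between observed and predicted durations. A minimal sketch of that metric follows. The function name and the sample durations are hypothetical illustrations, not taken from the paper or from the TranStar data.

```python
def mean_absolute_error(observed, predicted):
    """Average of |observed - predicted| over all incidents, in the input units."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one incident is required")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical incident durations in minutes (illustrative only, not TranStar data).
observed_min = [12.0, 45.0, 30.0, 8.0]
predicted_min = [20.0, 35.0, 33.0, 15.0]

mae = mean_absolute_error(observed_min, predicted_min)
print(f"MAE = {mae:.2f} min")
```

Because MAE is expressed in the same units as the target, a value such as 14.34 min can be read directly as the typical size of a prediction error in minutes.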
