Analysis of Factors Contributing to the Severity of Large Truck Crashes

Crashes involving large trucks often result in immense human, economic, and social losses. To prevent and mitigate severe large truck crashes, the factors contributing to their severity must be identified before appropriate countermeasures can be explored. In this research, we applied three tree-based machine learning (ML) techniques, namely random forest (RF), gradient boosting decision tree (GBDT), and adaptive boosting (AdaBoost), to analyze the factors contributing to the severity of large truck crashes. In addition, a mixed logit model was developed as a baseline against which the factors identified by the ML models were compared. The analysis was based on crash data collected from the Texas Crash Records Information System (CRIS) from 2011 to 2015. The results show that the GBDT model outperforms the other ML methods both in prediction accuracy and in identifying more of the contributing factors that the mixed logit model also found to be significant. Moreover, the GBDT method can effectively identify both categorical and numerical factors, and the directions and magnitudes of the impacts of the factors it identifies are all reasonable and explainable. Among the identified factors, driving under the influence of drugs or alcohol and driving while fatigued are the most important contributors to the severity of large truck crashes. In addition, the presence of curbs and medians, together with lanes and shoulders of sufficient width, can help prevent severe large truck crashes.
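The comparison of the three tree-based models can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the CRIS records; the function name compare_tree_models, the factor_i labels, the three-class severity target, and all hyperparameters are hypothetical choices for illustration, not the settings used in this research.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def compare_tree_models(n_samples=600, n_features=8, seed=0):
    """Fit RF, GBDT, and AdaBoost on synthetic data and rank the factors."""
    # Synthetic stand-in for crash records: 3 severity classes (assumption).
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=4,
        n_redundant=0,
        n_classes=3,
        n_clusters_per_class=1,
        random_state=seed,
    )
    names = [f"factor_{i}" for i in range(n_features)]  # hypothetical labels
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    models = {
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "GBDT": GradientBoostingClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=seed),
    }

    results = {}
    for label, model in models.items():
        model.fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))
        # Impurity-based importances; larger means a more influential factor.
        ranked = sorted(
            zip(names, model.feature_importances_),
            key=lambda pair: pair[1],
            reverse=True,
        )
        results[label] = {
            "accuracy": accuracy,
            "top_factors": [name for name, _ in ranked[:3]],
        }
    return results


if __name__ == "__main__":
    for label, summary in compare_tree_models().items():
        print(label, round(summary["accuracy"], 3), summary["top_factors"])
```

Impurity-based importances rank factors by magnitude only. The direction of each factor's effect would need a separate tool such as partial dependence, which this sketch does not cover.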
