Exploring the Mechanism of Crashes with Autonomous Vehicles Using Machine Learning

Safety has become a critical obstacle to the commercialization of autonomous vehicles (AVs) and cannot be ignored. This study explores the mechanism of AV-involved crashes and analyzes the impact of each feature on crash severity. We use the Apriori algorithm to mine the causal relationships among multiple contributing factors and thereby characterize the crash mechanism. We then model crash severity with several machine learning models: support vector machine (SVM), classification and regression tree (CART), and eXtreme Gradient Boosting (XGBoost). In addition, we apply SHapley Additive exPlanations (SHAP) to interpret the importance of each factor. XGBoost achieves the best performance (recall = 75%; G-mean = 67.82%). Both XGBoost and the Apriori algorithm provide meaningful insights into the characteristics of AV-involved crashes and the relationships among them. Among all features, vehicle damage, weather conditions, accident location, and driving mode are the most critical. We find that most rear-end crashes involve conventional vehicles striking the rear of AVs. Drivers should be extremely cautious when driving in fog, snow, or insufficient light, and careful when driving near intersections, especially in autonomous driving mode.
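The abstract reports recall and G-mean and relies on association rules, but does not define either. The following minimal Python sketch illustrates both under stated assumptions: severity is treated as a binary label, G-mean is taken as the geometric mean of recall (sensitivity) and specificity, and rules are scored with the usual support, confidence, and lift measures. The function names, labels, and toy records are hypothetical and are not taken from the study's data.

```python
from math import sqrt


def recall_and_g_mean(y_true, y_pred, positive=1):
    """Return (recall, G-mean) for binary labels.

    Assumes G-mean = sqrt(recall * specificity) and that both classes
    occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return recall, sqrt(recall * specificity)


def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Each record is a set of items; assumes the antecedent and the
    consequent each occur in at least one record.
    """
    n = len(records)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_consequent = sum(1 for r in records if consequent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / n
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n)
    return support, confidence, lift


# Toy labels (hypothetical): 1 = severe crash, 0 = non-severe crash.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
recall, g_mean = recall_and_g_mean(y_true, y_pred)

# Toy crash records (hypothetical item names).
records = [
    {"rear_end", "autonomous_mode", "intersection"},
    {"rear_end", "autonomous_mode"},
    {"sideswipe", "conventional_mode"},
    {"rear_end", "conventional_mode", "intersection"},
    {"sideswipe", "autonomous_mode", "intersection"},
]
support, confidence, lift = rule_metrics(
    records, {"autonomous_mode"}, {"rear_end"}
)

print(f"recall={recall:.4f} g_mean={g_mean:.4f}")
print(f"support={support:.4f} confidence={confidence:.4f} lift={lift:.4f}")
```

G-mean is commonly preferred over plain accuracy when severe crashes are rare, because it penalizes a classifier that scores well on the majority class only. A lift above 1 suggests the antecedent and consequent co-occur more often than independence would predict, which indicates association rather than proof of causation.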
