The potential of clustering methods to define intersection test scenarios: Assessing real-life performance of AEB.

Intersection accidents are frequent and harmful. The accident types 'straight crossing path' (SCP), 'left turn across path - oncoming direction' (LTAP/OD), and 'left turn across path - lateral direction' (LTAP/LD) represent around 95% of all intersection accidents and one-third of all police-reported car-to-car accidents in Germany. The European New Car Assessment Programme (Euro NCAP) has announced that intersection scenarios will be included in its rating from 2020; however, how these scenarios are to be tested has not been defined. This study investigates whether clustering methods can be used to identify a small number of test scenarios sufficiently representative of the accident dataset to evaluate Intersection Automated Emergency Braking (AEB). Data from the German In-Depth Accident Study (GIDAS) and the GIDAS-based Pre-Crash Matrix (PCM) from 1999 to 2016, containing 784 SCP and 453 LTAP/OD accidents, were analyzed with principal component methods to identify the variables that account for the relevant total variance of the sample. Three clustering methods were applied to each accident type: two similarity-based approaches, Hierarchical Clustering (HC) and Partitioning Around Medoids (PAM), and the probability-based Latent Class Clustering (LCC). For HC and PAM, the optimal number of clusters was derived with the silhouette method. The PAM algorithm was initialized both with randomly selected start medoids and with medoids from HC. For LCC, the Bayesian Information Criterion (BIC) was used to determine the optimal number of clusters. Test scenarios were defined from the optimal cluster medoids, weighted by their real-life representation in GIDAS. The set of clustering variables was further varied to investigate the influence of variable type and character. We quantified how accurately each cluster variation represents real-life AEB performance using pre-crash simulations with PCM data and a generic algorithm for AEB intervention.
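The silhouette-based choice of the number of clusters for a medoid-based method can be illustrated with a minimal sketch. It is not the study's implementation: the Gower dissimilarity for mixed continuous and categorical variables is an assumption (the abstract does not name the distance measure), the alternating k-medoids routine with a greedy start is a simplified stand-in for PAM, and the records (two speeds in km/h and a view-obstruction flag) are hypothetical toy data, not GIDAS cases.

```python
def gower(a, b, ranges):
    """Gower dissimilarity between two mixed-type records (assumed metric)."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:                      # categorical: simple mismatch
            total += 0.0 if x == y else 1.0
        elif r > 0:                        # continuous: range-normalised difference
            total += abs(x - y) / r
    return total / len(a)

def distance_matrix(rows):
    cols = list(zip(*rows))
    ranges = [max(c) - min(c) if isinstance(c[0], (int, float)) else None
              for c in cols]
    return [[gower(p, q, ranges) for q in rows] for p in rows]

def assign(dist, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids with a greedy start (simplified stand-in for PAM)."""
    n = len(dist)
    medoids = []
    for _ in range(k):                     # greedy, deterministic initialisation
        rest = [c for c in range(n) if c not in medoids]
        medoids.append(min(rest, key=lambda c: sum(
            min(dist[i][m] for m in medoids + [c]) for i in range(n))))
    for _ in range(max_iter):              # alternate assignment and medoid update
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [i for i, l in enumerate(labels) if l == c] or [medoids[c]]
            updated.append(min(members,
                               key=lambda m: sum(dist[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)

def mean_silhouette(dist, labels):
    """Average silhouette width computed from a precomputed distance matrix."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, l in enumerate(labels):
        own = clusters[l]
        if len(own) == 1:
            continue                       # singleton clusters score 0 by convention
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in other) / len(other)
                for key, other in clusters.items() if key != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

# Hypothetical records: (speed A [km/h], speed B [km/h], view obstructed?)
rows = [
    (10, 10, "yes"), (12, 12, "yes"), (11, 9, "yes"), (9, 11, "yes"),
    (50, 50, "no"), (52, 48, "no"), (49, 51, "no"), (51, 52, "no"),
    (90, 10, "no"), (88, 12, "no"), (91, 9, "no"), (89, 11, "no"),
]
dist = distance_matrix(rows)
scores = {k: mean_silhouette(dist, k_medoids(dist, k)[1]) for k in range(2, 6)}
best_k = max(scores, key=scores.get)       # k with the largest mean silhouette
print(best_k, round(scores[best_k], 3))
```

On this toy data the three planted groups are well separated, so the largest mean silhouette is reached at three clusters. Real accident data are far less clean, which is consistent with the sensitivity to the variable set that the study reports.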
Using different sets of clustering variables resulted in substantially different numbers of clusters. The stability of the resulting clusters increased when categorical variables were prioritized over continuous ones. For each set of clustering variables, the outcome for the specified Intersection AEB (accident avoided versus not avoided) varied strongly within clusters. The medoids did not predict the most common Intersection AEB behavior in each cluster. Despite thorough analysis using various clustering methods and variable sets, it was impossible to reduce the diversity of intersection accidents to a set of test scenarios without compromising the ability to predict the real-life performance of Intersection AEB. Although this does not imply that other methods cannot succeed, small changes in the definition of a scenario were observed to result in a different avoidance outcome. Therefore, we suggest using limited physical testing to validate more extensive virtual simulations to evaluate vehicle safety.
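The comparison between medoid-based test scenarios and the full accident sample can be sketched as follows. This is an illustrative assumption about how such a check might be organised, not the authors' evaluation code: the function name, the cluster labels, the medoid indices, and the simulated outcomes are all hypothetical.

```python
from collections import defaultdict

def medoid_representativeness(labels, medoids, avoided):
    """Compare medoid outcomes with per-cluster and overall avoidance rates.

    labels  : cluster id per accident
    medoids : dict mapping cluster id -> index of the medoid accident
    avoided : per-accident simulation outcome (True = accident avoided)
    """
    members = defaultdict(list)
    for i, label in enumerate(labels):
        members[label].append(i)

    n = len(labels)
    report = {}
    weighted_estimate = 0.0
    for cluster, idx in sorted(members.items()):
        weight = len(idx) / n                         # cluster share of the sample
        share = sum(avoided[i] for i in idx) / len(idx)
        medoid_outcome = avoided[medoids[cluster]]
        report[cluster] = {
            "weight": weight,
            "avoided_share": share,
            "medoid_matches_majority": medoid_outcome == (share >= 0.5),
        }
        # Each medoid stands in for its whole cluster, weighted by cluster size.
        weighted_estimate += weight * medoid_outcome
    true_rate = sum(avoided) / n
    return report, weighted_estimate, true_rate

# Hypothetical toy data: 10 accidents in 2 clusters.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
medoids = {0: 1, 1: 5}
avoided = [True, False, True, True, False, True, False, False, True, False]

report, estimate, true_rate = medoid_representativeness(labels, medoids, avoided)
print(report)
print(estimate, true_rate)
```

In this constructed example both medoids disagree with the majority outcome of their cluster, so the weighted estimate from the two medoid scenarios (0.6) differs from the avoidance rate over all ten accidents (0.5).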
