Pre-crash scenarios at road junctions: A clustering method for car crash data.

Given recent advances in autonomous driving functions, one of the main challenges is safe and efficient operation in complex traffic situations such as road junctions. This calls for comprehensive testing, either in virtual simulation environments or on real-world test tracks. This paper presents a novel data analysis method, covering the preparation, analysis and visualization of car crash data, to identify critical pre-crash scenarios at T-junctions and four-legged junctions (crossroads) as a basis for testing the safety of automated driving systems. The method uses k-medoids to cluster historical junction crash data into distinct partitions and then applies association rule mining to each cluster to specify the driving scenarios in more detail. The dataset consists of 1056 junction crashes in the UK, exported from the in-depth "On-the-Spot" database. The study yielded thirteen crash clusters for T-junctions and six crash clusters for crossroads. Association rules revealed common crash characteristics, which formed the basis for the scenario descriptions. The results support existing findings on road junction accidents and provide benchmark situations for safety performance tests, reducing the number of parameter combinations that need to be tested.
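
The two-stage idea described in the abstract (cluster categorical crash records, then examine rules within each cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the toy records and their attributes are hypothetical, the Hamming distance is one possible dissimilarity for categorical data, the clustering is a simple alternating k-medoids with farthest-first initialisation (not the full PAM swap search), and the second stage only evaluates the confidence of a single hand-picked rule, whereas a real analysis would mine rules with an algorithm such as Apriori.

```python
# Hypothetical toy records: (junction type, manoeuvre, light condition).
CRASHES = [
    ("T", "turn_right", "daylight"),
    ("T", "turn_right", "daylight"),
    ("T", "turn_right", "dark"),
    ("X", "straight", "daylight"),
    ("X", "straight", "dark"),
    ("X", "straight", "dark"),
]


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def assign(records, medoids):
    """Label each record with the record index of its nearest medoid."""
    return [min(medoids, key=lambda m: hamming(r, records[m])) for r in records]


def init_medoids(records, k):
    """Farthest-first initialisation: start at record 0, then repeatedly
    add the record farthest from the medoids chosen so far."""
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(records)),
            key=lambda i: min(hamming(records[i], records[m]) for m in medoids),
        ))
    return medoids


def k_medoids(records, k, max_iter=20):
    """Simple alternating k-medoids; returns (medoid indices, labels)."""
    medoids = init_medoids(records, k)
    for _ in range(max_iter):
        labels = assign(records, medoids)
        new_medoids = []
        for m in medoids:
            members = [i for i, lab in enumerate(labels) if lab == m]
            if not members:  # keep the old medoid if its cluster is empty
                new_medoids.append(m)
                continue
            # New medoid: the member with the smallest total distance
            # to all members of its cluster.
            new_medoids.append(min(
                members,
                key=lambda c: sum(hamming(records[c], records[j]) for j in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(records, medoids)


def rule_confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.
    Each side is a dict {attribute_index: value}."""
    matches = [r for r in records
               if all(r[i] == v for i, v in antecedent.items())]
    if not matches:
        return 0.0
    hits = [r for r in matches
            if all(r[i] == v for i, v in consequent.items())]
    return len(hits) / len(matches)


if __name__ == "__main__":
    medoids, labels = k_medoids(CRASHES, k=2)
    for m in medoids:
        cluster = [r for r, lab in zip(CRASHES, labels) if lab == m]
        print("medoid:", CRASHES[m], "cluster size:", len(cluster))
```

A medoid is always an actual record, so each cluster can be summarised by a real crash case, which is convenient when the goal is to derive concrete test scenarios. Calling `rule_confidence` on the records of one cluster then gives the share of crashes matching the antecedent that also match the consequent.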
