Data Mining and Complex Network Algorithms for Traffic Accident Analysis

The field of traffic accident analysis has long been dominated by traditional statistical analysis. With recent advances in data collection, storage, and archival methods, accident data sets have grown significantly in size. This growth, in turn, has motivated research on applying data mining and complex network analysis algorithms, which are designed specifically to handle high-dimensional data sets, to traffic accident analysis. This paper explores the potential of two such methods, namely a modularity-optimizing community detection algorithm and the association rule learning algorithm, for identifying important accident characteristics. As a case study, the algorithms were applied to an accident data set compiled for Interstate 190 in the Buffalo–Niagara, New York, metropolitan area. Specifically, the community detection algorithm was used to cluster the data to reduce its inherent heterogeneity, and the association rule learning algorithm was then applied to each cluster to discern meaningful patterns, particularly those related to high accident frequency locations (hot spots) and incident clearance time. To demonstrate the benefits of clustering, the association rule algorithm was also applied to the whole data set (before clustering), and the results were compared with those obtained from the clusters. The study results indicated that (a) the community detection algorithm was quite effective in identifying clusters with discernible characteristics, (b) clustering helped unveil relationships and accident causative factors that remained hidden when the analysis was performed on the whole data set, and (c) the association rule learning algorithm yielded useful insight into accident hot spots and incident clearance time along I-190.
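
The association rule step described above can be illustrated with a minimal sketch. It assumes the records have already been assigned to one cluster and encoded as sets of attribute=value items. The toy records, the attribute names, the thresholds, and the helper names `support` and `one_to_one_rules` are hypothetical and are not taken from the I-190 study. The sketch scores only one-item-to-one-item rules by support, confidence, and lift, and stands in for a full association rule learner.

```python
from itertools import permutations

# Hypothetical toy records for a single cluster (not data from the study).
records = [
    {"weather=snow", "time=night", "hotspot=yes"},
    {"weather=snow", "time=day", "hotspot=yes"},
    {"weather=clear", "time=day", "hotspot=no"},
    {"weather=snow", "time=night", "hotspot=yes"},
    {"weather=clear", "time=night", "hotspot=no"},
]


def support(itemset, rows):
    """Fraction of rows that contain every item in itemset."""
    return sum(itemset <= row for row in rows) / len(rows)


def one_to_one_rules(rows, min_support=0.4, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for rules with exactly one item on each side."""
    items = sorted(set().union(*rows))
    found = []
    for a, b in permutations(items, 2):
        s_ab = support({a, b}, rows)
        if s_ab < min_support:
            continue  # the item pair is too rare to be of interest
        confidence = s_ab / support({a}, rows)
        if confidence < min_confidence:
            continue  # the antecedent does not reliably imply the consequent
        lift = confidence / support({b}, rows)
        found.append((a, b, s_ab, confidence, lift))
    return found


if __name__ == "__main__":
    for a, b, s, c, l in one_to_one_rules(records):
        print(f"{a} -> {b}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than they would if they were independent. Running the same function once per cluster and once on the pooled records would mirror the before-and-after-clustering comparison described in the abstract.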
