A clustering regression approach: A comprehensive injury severity analysis of pedestrian-vehicle cr

Understanding the relationship between pedestrian injury severity outcomes and the factors that lead to more severe injuries is essential for improving pedestrian safety. This research combines data mining and statistical regression methods to identify the main factors associated with pedestrian injury severity levels. The work relies on the analysis of two unique pedestrian injury severity datasets, from New York City, US (2002–2006) and the City of Montreal, Canada (2003–2006). General injury severity models were estimated for each dataset and for sub-populations obtained through clustering analysis. The paper shows how segmenting the accident datasets helps to better understand the complex relationship between injury severity outcomes and the contribution of geometric, built environment, and socio-demographic factors. The same methodology was applied to both datasets, but different clustering and regression techniques were tested within it. For the New York dataset, latent class clustering combined with an ordered probit model provides the best results, whereas for Montreal, K-means clustering combined with a multinomial logit model proves most appropriate. Among other results, pedestrian age, location type, driver age, vehicle type, driver alcohol involvement, lighting conditions, and several built environment characteristics were found to influence the likelihood of fatal crashes. Finally, the research provides recommendations for policy makers, traffic engineers, and law enforcement on reducing the severity of pedestrian–vehicle collisions.
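The two-stage idea described above (segment the crashes first, then estimate a separate severity model per segment) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the two features (a scaled pedestrian age and a night-time indicator), the three severity levels, and all function names are assumptions made for illustration, and the multinomial logit is fitted by simple gradient ascent rather than by an econometrics package.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, x):
    """Severity-level probabilities for one crash; W has one row per level."""
    xb = (1.0,) + tuple(x)  # prepend the intercept term
    return softmax([sum(w * v for w, v in zip(row, xb)) for row in W])


def fit_mnl(X, y, n_classes, lr=0.5, epochs=200):
    """Multinomial logit by gradient ascent; level 0 is the reference (zero weights)."""
    n_feat = len(X[0]) + 1
    W = [[0.0] * n_feat for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_feat for _ in range(n_classes)]
        for x, yi in zip(X, y):
            xb = (1.0,) + tuple(x)
            p = predict_proba(W, x)
            for c in range(1, n_classes):
                err = (1.0 if c == yi else 0.0) - p[c]
                for j, v in enumerate(xb):
                    grad[c][j] += err * v
        for c in range(1, n_classes):
            for j in range(n_feat):
                W[c][j] += lr * grad[c][j] / len(X)
    return W


# Synthetic crashes (assumed, not from the paper): (scaled age, night indicator).
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    older = rng.random() < 0.5
    age = rng.gauss(0.8 if older else 0.2, 0.05)
    night = 1.0 if rng.random() < (0.7 if older else 0.2) else 0.0
    score = 2.0 * age + 1.0 * night + rng.gauss(0, 0.5)
    X.append((age, night))
    y.append(0 if score < 1.0 else (1 if score < 2.0 else 2))  # 0=minor .. 2=fatal

# Stage 1: segment the crashes. Stage 2: one severity model per segment.
labels = kmeans(X, k=2)
models = {}
for c in sorted(set(labels)):
    Xc = [x for x, lab in zip(X, labels) if lab == c]
    yc = [s for s, lab in zip(y, labels) if lab == c]
    models[c] = fit_mnl(Xc, yc, n_classes=3)
    print(c, len(Xc), [round(p, 3) for p in predict_proba(models[c], (0.8, 1.0))])
```

Comparing the fitted coefficients across segments is what would reveal factors whose effect differs between sub-populations. In practice, a dedicated statistics package would also provide standard errors and goodness-of-fit measures, which this sketch omits.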
