Statistical Issues in Traffic Accident Modeling

Accident prediction models are invaluable tools with many applications in road safety analysis. However, certain statistical issues related to accident modeling either deserve further attention or have not been dealt with adequately in the road safety literature. This paper discusses and illustrates how to deal with two such issues that arise when modeling accidents using Poisson and negative binomial regression. The first issue is model building, that is, deciding which explanatory variables to include in an accident prediction model. The study differentiates between applications for which it is advisable to avoid model over-fitting and applications for which it is desirable to fit the model to the data as closely as possible. It then suggests procedures for developing parsimonious models, i.e., models that are not over-fitted, and best-fit models. The second issue is outlier analysis. The study suggests a procedure for identifying extremely influential outliers and excluding them from the development of Poisson and negative binomial regression models. The procedures suggested for model building and outlier analysis are more straightforward to apply to Poisson regression models than to negative binomial regression models, because the shape parameter of the negative binomial distribution adds complexity. The paper therefore presents flowcharts detailing the application of the procedures when modeling is carried out using negative binomial regression. The described procedures are then applied in the development of negative binomial accident prediction models for the urban arterials of the cities of Vancouver and Richmond in the province of British Columbia, Canada.

TRB 2003 Annual Meeting CD-ROM Paper revised from original submittal.