Modelling the hierarchical structure of road crash data: application to severity analysis.

Road crashes have an unquestionably hierarchical crash-car-occupant structure. Multilevel models are designed for correlated data of this kind, but applying them to crash data can be difficult: the number of sub-clusters per cluster is small, with fewer than two cars per crash and fewer than two occupants per car, whereas the number of clusters can be large, with several hundred or several thousand crashes. Applying the Monte Carlo method to observed and simulated French road crash data from 1996 to 2000 allows comparison of the estimates produced by multilevel logistic models (MLM), Generalized Estimating Equation models (GEE) and logistic models (LM). A bias study shows MLM to be the most efficient model, whereas both GEE and LM underestimate parameters and confidence intervals. MLM is used as a marginal model and not as a random-effect model, i.e. only fixed effects are taken into account. Random effects allow risks to be adjusted for the hierarchical structure, which gives MLM an interpretative advantage over GEE. Nevertheless, great care is needed in data coding, and a fairly large number of crashes is necessary to avoid problems and errors in the estimates and in the estimation process. On balance, MLM must be used when the number of vehicles per crash or the number of occupants per vehicle is high, when the LM results are questionable because they are not in line with the literature, or when the p-values associated with risk measures are close to 5%. In other cases, LM remains a practical analytical tool for modelling crash data.
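
The comparison described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes a random-intercept logistic model with hypothetical crash-level and car-level standard deviations (`sigma_crash`, `sigma_car`), one hypothetical binary occupant covariate `x`, and one or two cars per crash with one or two occupants per car. It then fits an ordinary logistic model (LM) that ignores the clustering. Under these assumptions the LM coefficient is attenuated towards zero relative to the cluster-specific coefficient used to generate the data. This is one well-known mechanism consistent with the underestimation reported in the abstract.

```python
import numpy as np


def simulate_crashes(n_crashes, beta, sigma_crash, sigma_car, rng):
    """Simulate occupant-level severity (0/1) in a crash-car-occupant hierarchy.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    xs, ys = [], []
    for _ in range(n_crashes):
        u_crash = rng.normal(0.0, sigma_crash)      # crash-level random intercept
        for _ in range(int(rng.integers(1, 3))):    # 1 or 2 cars per crash
            u_car = rng.normal(0.0, sigma_car)      # car-level random intercept
            for _ in range(int(rng.integers(1, 3))):  # 1 or 2 occupants per car
                x = int(rng.integers(0, 2))         # hypothetical binary risk factor
                eta = -1.0 + beta * x + u_crash + u_car
                p = 1.0 / (1.0 + np.exp(-eta))
                xs.append(x)
                ys.append(1 if rng.random() < p else 0)
    return np.array(xs, dtype=float), np.array(ys, dtype=float)


def fit_logistic(X, y, n_iter=25):
    """Ordinary logistic regression fitted by Newton-Raphson (clustering ignored)."""
    coef = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ coef)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        coef = coef + np.linalg.solve(hess, grad)
    return coef


rng = np.random.default_rng(0)
true_beta = 1.0  # cluster-specific (conditional) log odds ratio used to generate data
x, y = simulate_crashes(8000, true_beta, sigma_crash=1.5, sigma_car=0.5, rng=rng)

X = np.column_stack([np.ones_like(x), x])  # intercept + covariate
beta_lm = fit_logistic(X, y)[1]

print(f"conditional beta used in simulation: {true_beta:.2f}")
print(f"plain logistic (LM) estimate:        {beta_lm:.2f}")
```

Repeating the simulation over many random seeds and averaging the difference between `beta_lm` and `true_beta` would give a Monte Carlo bias estimate in the spirit of the study. Fitting a multilevel or GEE model to the same data would require a dedicated statistics library and is left out of this sketch.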
