Crash data quality for road safety research: Current state and future directions.

Crash databases are one of the primary data sources for road safety research. Their quality is therefore fundamental to the accuracy of crash analyses and, consequently, to the design of effective countermeasures. Although crash data often suffer from correctness and completeness issues, these issues are rarely discussed or addressed in crash analyses. Crash reports aim to answer the five "W" questions (i.e. When?, Where?, What?, Who? and Why?) of each crash by including a range of attributes. This paper reviews the current literature on the state of crash data quality for each of these questions separately. The most serious data quality issues appear to be: inaccuracies in crash location and time; difficulties in data linkage (e.g. with traffic data) due to inconsistencies in databases; severity misclassification; inaccurate and incomplete demographics of the road users involved; and inaccurate identification of crash contributory factors. It is shown that the extent and severity of data quality issues vary between attributes, and that their impact on road safety analyses is not yet entirely known. This paper highlights areas that require further research and provides some suggestions for the development of intelligent crash reporting systems.
