Categorical similarity comparison of CIREN and NASS.

In vehicle crash and injury databases, a similarity metric between databases is useful because it allows cases from one database to be compared with, and analyzed alongside, cases from another. In this study, the Mahalanobis metric was used to quantitatively score the similarity between selected database fields in Crash Injury Research and Engineering Network (CIREN) cases and the same fields in the National Automotive Sampling System (NASS) population. One difficulty is that many fields in CIREN and NASS are non-ordinal, so they require preprocessing before the metric can be applied. This study presents an implementation of the Mahalanobis metric in which many non-ordinal discrete fields are converted to ordinal fields by a preprocessing function specific to each field. The cases were split into categories, and a subset of NASS cases served as the population. The search region was defined as a common crash scenario, and seven important fields were used in the analysis. In this analysis, the three most similar CIREN cases fell within the search region defined in NASS. These results indicate that the Mahalanobis metric is a viable similarity scoring system for non-ordinal NASS database entries.
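
The scoring approach can be illustrated with a minimal sketch in Python with NumPy. It is not the study's implementation. The field names (`impact`, `delta_v`, `age`), the category-to-rank mapping, and all numeric values are hypothetical and chosen only for illustration. The sketch assumes that each non-ordinal field is mapped to an ordinal code by a field-specific function, that the mean and covariance are estimated from the population subset, and that each query case is scored by its Mahalanobis distance from the population mean, with smaller distances indicating greater similarity.

```python
import numpy as np

# Hypothetical field-specific preprocessing: maps a non-ordinal category
# to an ordinal code. The categories and their ranks are illustrative only.
IMPACT_RANK = {"frontal": 0, "side": 1, "rear": 2, "rollover": 3}


def encode(case):
    """Convert one case (a dict of hypothetical fields) to a numeric vector."""
    return [IMPACT_RANK[case["impact"]], case["delta_v"], case["age"]]


def mahalanobis_scores(population, queries):
    """Mahalanobis distance of each query vector from the population mean."""
    pop = np.asarray(population, dtype=float)
    mean = pop.mean(axis=0)
    cov = np.cov(pop, rowvar=False)   # rows are cases, columns are fields
    cov_inv = np.linalg.pinv(cov)     # pseudo-inverse tolerates a singular covariance
    diff = np.asarray(queries, dtype=float) - mean
    # Quadratic form diff_i^T * cov_inv * diff_i, evaluated for every query i
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors


# Made-up population subset (standing in for NASS) and query cases
# (standing in for CIREN); none of these values come from either database.
population_cases = [
    {"impact": "frontal", "delta_v": 30, "age": 25},
    {"impact": "side", "delta_v": 40, "age": 35},
    {"impact": "frontal", "delta_v": 50, "age": 45},
    {"impact": "rear", "delta_v": 35, "age": 60},
    {"impact": "side", "delta_v": 45, "age": 30},
]
query_cases = [
    {"impact": "frontal", "delta_v": 42, "age": 38},
    {"impact": "rollover", "delta_v": 80, "age": 70},
]

scores = mahalanobis_scores(
    [encode(c) for c in population_cases],
    [encode(c) for c in query_cases],
)
ranking = np.argsort(scores)  # query indices ordered from most to least similar
```

Because the distance is scaled by the population covariance, fields with different units and spreads contribute on a comparable footing, which is one reason the metric suits mixed database fields. The ordinal codes assigned during preprocessing still determine how far apart two categories are, so the choice of mapping for each field directly affects the scores.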