Exploring the Utility of Demographic Data and Vaccination History Data in the Deduplication of Immunization Registry Patient Records

Duplicate patient records pose a major problem for many immunization registries, as well as for many electronic patient record systems. This paper reports two complementary studies exploring the deduplication of immunization registry records. The first study explores the utility of different demographic data elements, singly and in combination, in the deduplication process. The second study explores how clinical patient data (vaccination history data) might assist in this process. To assess the utility of demographic data elements, data from three registries were used after duplicates had been identified. A computer program, IMM/Scan, was written to count the number of true-positive (TP) matches and false-positive (FP) matches found when using different Boolean combinations of demographic data elements. In this study, a strategy of "ORing high value ANDed pairs of data elements" appeared to be most powerful. To assess the utility of vaccination history data, record pairs were drawn from 440,000 patient records. Two metrics on vaccination history were tested: (1) the number of identical doses shared by two records, and (2) the number of "extra" doses in the combined history of two records. Sample findings from this study include: (1) for pairs of nonduplicate records, 93% had no identical doses and 90.6% had "extra" doses, and (2) for pairs of duplicate records, 83.8% had one or more identical doses and 82% contained no "extra" doses. These studies demonstrate potentially useful approaches to using demographic data and vaccination history data to assist the automated deduplication of immunization patient records.
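The two ideas summarized above can be illustrated with a minimal sketch. It is not the IMM/Scan program, and every name in it is hypothetical: the field names, the rule list `RULES`, the representation of a dose as a `(vaccine, date)` tuple, and the per-vaccine caps in `max_doses`. In particular, the abstract does not define how "extra" doses are counted; the sketch assumes that a dose is "extra" when the merged history of two records holds more doses of a vaccine than a supplied cap allows.

```python
from collections import Counter

# Hypothetical rule set (not from the paper): each inner tuple is a pair of
# demographic fields that must BOTH agree (AND); the pairs are combined with OR.
RULES = [
    ("last_name", "birth_date"),
    ("first_name", "birth_date"),
    ("medicaid_id", "last_name"),
]


def fields_agree(rec_a, rec_b, field):
    """True only if both records carry the same non-empty value for the field."""
    value_a, value_b = rec_a.get(field), rec_b.get(field)
    return value_a not in (None, "") and value_a == value_b


def is_candidate_match(rec_a, rec_b, rules=RULES):
    """OR over ANDed pairs: any single pair with both fields agreeing suffices."""
    return any(all(fields_agree(rec_a, rec_b, f) for f in pair) for pair in rules)


def identical_doses(hist_a, hist_b):
    """Metric 1: number of (vaccine, date) doses present in both histories."""
    return len(set(hist_a) & set(hist_b))


def extra_doses(hist_a, hist_b, max_doses):
    """Metric 2 (assumed definition): doses in the merged history beyond a
    per-vaccine cap. Vaccines missing from max_doses contribute nothing."""
    merged = set(hist_a) | set(hist_b)
    per_vaccine = Counter(vaccine for vaccine, _ in merged)
    return sum(max(0, n - max_doses.get(v, n)) for v, n in per_vaccine.items())


# Illustrative records and histories (invented for this example).
rec_a = {"first_name": "Ana", "last_name": "Ruiz", "birth_date": "1999-04-02"}
rec_b = {"first_name": "Anna", "last_name": "Ruiz", "birth_date": "1999-04-02"}
rec_c = {"first_name": "Ana", "last_name": "Lopez", "birth_date": "2000-01-01"}

hist_a = [("DTaP", "1999-06-02"), ("DTaP", "1999-08-02"), ("MMR", "2000-04-10")]
hist_b = [("DTaP", "1999-06-02"), ("DTaP", "1999-08-05"), ("MMR", "2000-04-10")]
caps = {"DTaP": 2, "MMR": 1}  # hypothetical caps, not a clinical schedule

match_ab = is_candidate_match(rec_a, rec_b)
match_ac = is_candidate_match(rec_a, rec_c)
shared = identical_doses(hist_a, hist_b)
extra = extra_doses(hist_a, hist_b, caps)
```

In this sketch, `rec_a` and `rec_b` match through the last-name and birth-date pair even though the first names differ, which is the benefit of ORing several ANDed pairs: one unreliable field does not block a match. The dose metrics are computed separately and could be used to confirm or reject a candidate pair.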