Creating a large database test bed with typographical errors for record linkage evaluation.

Evaluating record linkage algorithms requires a large database test bed that is representative of real-world data. We created such a database: it reflects the demographic distribution of a typical population and contains typographical errors of the kinds commonly made during data entry. This database can be used with high confidence as a test bed for evaluating a variety of record linkage algorithms.
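
To make the idea of seeding a test bed with data-entry errors concrete, here is a minimal Python sketch. It is not the procedure used to build the database described above, which this abstract does not specify. It assumes a simple error model with four single-character edits (insertion, deletion, substitution, transposition), and the names `inject_typo`, `make_test_pair`, and `error_rate` are hypothetical.

```python
import random
import string


def inject_typo(value, rng):
    """Apply one randomly chosen single-character edit to a string.

    Assumed error model (not taken from the source): insertion, deletion,
    substitution, or transposition of adjacent characters.
    """
    if len(value) < 2:
        return value  # too short to edit safely
    op = rng.choice(["insert", "delete", "substitute", "transpose"])
    i = rng.randrange(len(value) - 1)  # position of the edit
    letter = rng.choice(string.ascii_lowercase)
    if op == "insert":
        return value[:i] + letter + value[i:]
    if op == "delete":
        return value[:i] + value[i + 1:]
    if op == "substitute":
        return value[:i] + letter + value[i + 1:]
    # transpose: swap the characters at positions i and i + 1
    return value[:i] + value[i + 1] + value[i] + value[i + 2:]


def make_test_pair(record, error_rate, rng):
    """Return a copy of a record with each field corrupted with
    probability error_rate (a hypothetical parameter)."""
    return {
        field: inject_typo(text, rng) if rng.random() < error_rate else text
        for field, text in record.items()
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the corruption is repeatable
    clean = {"first_name": "mary", "last_name": "johnson"}
    print(make_test_pair(clean, error_rate=0.5, rng=rng))
```

Because each corrupted record is derived from a known clean record, the true match status of every pair is known, which is what allows a linkage algorithm's output to be scored. A realistic generator would weight the edit types and positions by observed data-entry error frequencies rather than choosing them uniformly as this sketch does.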