High-Performance Computing Techniques for Record Linkage

Record linkage is the task of linking together information from one or more data sources that represents the same entity (patient, customer, provider, business, etc.). If no unique identifier is available, probabilistic linkage techniques have to be applied.

Applications of record linkage

- Remove duplicates in a data set (internal linkage)
- Merge new records into a larger master data set
- Create patient-oriented statistics
- Compile data for longitudinal studies
- Clean data sets for data mining projects or mailing lists
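To make the idea of probabilistic linkage concrete, here is a minimal sketch of one common approach: summing per-field agreement weights in the Fellegi-Sunter style. The source does not say which probabilistic method is used, so this is an illustration only. The field names, the m/u probabilities, and the two thresholds are all hypothetical values chosen for the example.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the source):
#   m = P(field agrees | the two records refer to the same entity)
#   u = P(field agrees | the two records refer to different entities)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "given_name": (0.90, 0.05),
    "birth_year": (0.98, 0.02),
}


def match_weight(rec_a, rec_b):
    """Sum log2 agreement/disagreement weights over all compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Map a weight to a decision using two assumed thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


if __name__ == "__main__":
    a = {"surname": "miller", "given_name": "peter", "birth_year": 1970}
    b = {"surname": "miller", "given_name": "pete", "birth_year": 1970}
    w = match_weight(a, b)
    print(round(w, 2), classify(w))
```

Record pairs that fall between the two thresholds are typically set aside for manual review. In practice the m and u values would be estimated from the data rather than fixed by hand.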