Contribution of Record Linkage to Vital Status Determination in Cancer Patients

The aim of this study was to assess the performance of vital status determination by record linkage between a hospital database and the French national mortality database, using anonymised data as required by French legislation. The hospital database of the Institut Gustave Roussy (IGR), the largest cancer centre in France, and the French mortality databases for 1998-2004 were used for this record linkage. A phonetic code adapted to the French language was first applied to the identifiers. The last name, maiden name, all first names and the date of birth were then each anonymised using irreversible hash coding. Record linkage, using the probabilistic method developed by Jaro, was based on four fields: last name, first given name, date of birth and birthplace code. Other variables were used for further automatic and manual validation. Linkage performed well for the 10,089 patients included: sensitivity was 94.8% and specificity 99.5%. The positive and negative likelihood ratios were 190 and 0.05, respectively. The main causes of discordance were erroneous or incomplete information, such as an unrecorded maiden name in the hospital database. Adding manual validation to electronic matching improved sensitivity to 97.2%, with specificity at 99.4%. Record linkage using anonymised data is feasible for large-scale hospital data and has good validity. This method offers new prospects for large prognostic studies based on hospital data, provided that the diagnosis date is systematically recorded in the hospital database.
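
As a check on the reported figures, the likelihood ratios follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short Python sketch below applies these definitions to the electronic-matching values of 94.8% and 99.5%; the function names are illustrative and are not taken from the study.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ : how much more often a link is found for a deceased patient
    than for a patient who is alive."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- : how much more often no link is found for a deceased patient
    than for a patient who is alive."""
    return (1.0 - sensitivity) / specificity


# Values reported for electronic matching alone (before manual validation).
sensitivity = 0.948
specificity = 0.995

lr_pos = positive_likelihood_ratio(sensitivity, specificity)
lr_neg = negative_likelihood_ratio(sensitivity, specificity)

print(f"LR+ = {lr_pos:.0f}, LR- = {lr_neg:.2f}")
```

With these inputs the ratios round to 190 and 0.05, which agrees with the values reported above.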