On the feasibility of linking census samples to the National Death Index for epidemiologic studies: a progress report.

To test the feasibility of using large national probability samples provided by the US Census Bureau for epidemiologic studies, a pilot project was initiated to link 230,000 Census-type records to the National Death Index (NDI). Under strict precautions to maintain the complete confidentiality of individual records, the Current Population Survey files of one month in 1973 and one month in 1978 were matched by computer to the 1979 NDI file. The basic question was whether deaths obtained in this way are seriously undercounted when the Census record contains no Social Security Number (SSN). The search of the NDI file yielded 5,542 matches, of which about 1,800 appear to be "true positives" representing deaths; the remainder are "false positives." Of the deaths, 80 per cent would still have been detected without an SSN in the Census record. The main reasons for missed deaths (false negatives) were discrepancies in the year of birth and in the given name. With certain changes in the NDI matching algorithm, the 80 per cent figure could rise to 85 per cent or higher; however, such changes could also substantially increase the number of false positives. Staff of the National Heart, Lung and Blood Institute (NHLBI) and the Census Bureau are currently developing a probabilistic method to eliminate false positives from the NDI output tape. The results of the pilot study indicate that a larger research project is clearly feasible.
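The counts quoted above imply a few derived figures that may help in reading the results. The following is a minimal arithmetic sketch in Python, not part of the original study; it treats "about 1,800" as exactly 1,800, and the variable and function names are illustrative assumptions.

```python
# Figures quoted in the abstract. The true-positive count is approximate
# ("about 1,800"), so every derived value below is approximate as well.
total_matches = 5542
true_positives = 1800

# Matches that do not represent deaths.
false_positives = total_matches - true_positives

# Share of NDI matches that appear to be real deaths.
share_true = true_positives / total_matches


def deaths_detected_without_ssn(deaths, detection_rate):
    """Deaths still found when the Census record carries no SSN (hypothetical helper)."""
    return round(deaths * detection_rate)


at_current_rate = deaths_detected_without_ssn(true_positives, 0.80)
at_revised_rate = deaths_detected_without_ssn(true_positives, 0.85)

print(f"False positives: {false_positives}")
print(f"Share of matches that are deaths: {share_true:.1%}")
print(f"Deaths detected without SSN at 80 per cent: {at_current_rate}")
print(f"Deaths detected without SSN at 85 per cent: {at_revised_rate}")
print(f"Deaths missed at 80 per cent: {true_positives - at_current_rate}")
```

Under these assumptions, roughly two of every three matches are false positives, which is consistent with the stated need for a method to screen them out of the NDI output.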