Identification of pneumonia and influenza deaths using the death certificate pipeline

Background: Death records are a rich source of data that can support public health surveillance and decision support. To be used for these purposes, however, the data must first be transformed into a coded, computable format. Because the cause of death on a certificate is reported as free text, encoding is currently the single largest barrier to using death certificates for surveillance. The purpose of this study was therefore to demonstrate the feasibility of a pipeline, composed of a natural language processor and a detection rule, for the real-time encoding of death certificates, using the identification of pneumonia and influenza cases as an example, and to show that its accuracy is comparable to that of existing methods.

Results: A Death Certificates Pipeline (DCP) was developed to automatically code death certificates and identify pneumonia and influenza cases. The pipeline used MetaMap to code death certificates from the Utah Department of Health for the year 2008. The MetaMap output was then passed to detection rules, which flagged pneumonia and influenza cases based on the Centers for Disease Control and Prevention (CDC) case definition. The output of the DCP was compared with the method currently used by the CDC and with a keyword search. Recall, precision (positive predictive value), and F-measure were calculated for the DCP and the keyword search, taking the CDC method as the reference. The recall/precision results were 0.998/0.98 for the DCP and 0.96/0.96 for the keyword search; the corresponding F-measures were 0.99 and 0.96. Both the keyword search and the DCP can run interactively on modest computing resources, but the DCP showed superior performance.

Conclusion: The pipeline proposed here for coding death certificates and detecting cases is feasible and can be extended to other conditions. The method offers a way to code free-text death certificates in real time, which may increase their use not only in the public health domain but also by biomedical researchers and developers.

Trial Registration: This study did not involve any clinical trials.
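As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from each recall/precision pair, assuming the balanced F-measure (the harmonic mean of precision and recall); the abstract does not state which weighting was used. The helper names `f_measure` and `evaluate` and the toy record IDs are hypothetical illustrations, not part of the DCP.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed weighting)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(flagged: set, reference: set) -> dict:
    """Score a set of flagged record IDs against a reference set of record IDs.

    `flagged` stands for certificates flagged by a method under test and
    `reference` for those flagged by the reference method (both hypothetical inputs).
    """
    true_positives = len(flagged & reference)
    recall = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(flagged) if flagged else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure(precision, recall),
    }


# Recall/precision pairs as reported in the abstract.
dcp_f = f_measure(precision=0.98, recall=0.998)
keyword_f = f_measure(precision=0.96, recall=0.96)
print(round(dcp_f, 2), round(keyword_f, 2))

# Toy illustration with made-up record IDs (not study data).
toy_scores = evaluate(flagged={1, 2, 3}, reference={2, 3, 4})
```

Under this assumption the recomputed values round to the reported F-measures of 0.99 and 0.96.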
