Evaluating Natural Language Processing Applications Applied to Outbreak and Disease Surveillance

Much of the pre-existing electronic data that could be harnessed for early outbreak detection is in free-text format. Natural language processing (NLP) techniques may be useful to biosurveillance by classifying and extracting information described in free-text sources. In the Real-time Outbreak and Disease Surveillance laboratory we are developing and evaluating NLP techniques for surveillance of syndromic presentations and specific findings or diseases potentially caused by bioterrorism-related or naturally occurring outbreaks. We have implemented a three-stage evaluation process to determine whether NLP techniques are useful for outbreak detection. First, we are evaluating the technical accuracy of the NLP techniques to answer the question “How well can we classify, extract, or encode relevant information from text?” Second, we are evaluating the diagnostic accuracy of the techniques to answer the question “How well can we diagnose patients of interest using the NLP techniques?” Third, we are evaluating the outcome efficacy of the techniques to answer the question “How well can we detect outbreaks with an NLP-based biosurveillance system?” We give examples from our research for all three levels of evaluation and conclude with suggestions for determining whether NLP is feasible for outbreak and disease surveillance.

Introduction and Background

The appearance of new infectious diseases (e.g., the outbreak of Severe Acute Respiratory Syndrome (SARS) in Asia and Toronto), the reemergence of old infectious diseases (e.g., tuberculosis outbreaks), and the deliberate introduction of infectious diseases through bioterrorism (e.g., the October 2001 anthrax attacks) demonstrate the need for surveillance of infectious disease [1]. The United States has not been prepared to deal with biological attacks [2], and biodefense has quickly become a national priority [3].
In response to the need for better biodefense, several research groups have developed electronic surveillance systems [4-13] that monitor a variety of data sources, including over-the-counter drug sales [14, 15], web-based physician entry of reports [16], 911 calls [17], consumer health hotline telephone calls [18, 19], and ambulatory care visit records [20-24]. Many of the systems monitor pre-existing electronic emergency department (ED) data [25] that typically include date of admission, sex, age, address, coded discharge diagnosis [22, 23, 26], and free-text triage chief complaint [21, 27-30]. Detection algorithms count the number of occurrences of a variable or a combination of variables in a given spatial location over a given time period to look for anomalous patterns [21, 31-33]. If the algorithms detect a significant increase in a given variable, such as the number of patients with a gastrointestinal illness, they alert relevant medical and public health officials to a possible outbreak.

The Real-time Outbreak and Disease Surveillance (RODS) system [34] is a biosurveillance system adherent to the CDC’s NEDSS standards [35] that was developed in 1999 at the University of Pittsburgh and is currently deployed in four states: Pennsylvania, Utah, New Jersey, and Ohio. For over 100 hospitals in the four states, RODS collects real-time admission data, including age, sex, zip code, and triage chief complaint. Time-series detection algorithms are applied to the information in the database, and the counts of patients with seven types of syndromic presentations are displayed graphically in the user interface, shown in Figure 1. The interface also includes a geographic information system that shows counts of syndromic presentations by zip code.
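The count-and-compare logic described above can be illustrated with a minimal sketch. It assumes a simple z-score rule over a short history of daily counts; the function name `should_alarm`, the threshold of 3.0, and the sample counts are hypothetical and are not taken from RODS, which applies its own time-series detection algorithms.

```python
from statistics import mean, stdev


def should_alarm(history, today, z_threshold=3.0):
    """Return True if today's count is unusually high relative to history.

    history: recent daily counts for one syndrome in one location
             (at least two values, so a standard deviation can be computed).
    today: the count observed for the current day.
    z_threshold: hypothetical cutoff; real systems tune this carefully.
    """
    expected = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Flat history: any count above the expected value is flagged.
        return today > expected
    return (today - expected) / spread > z_threshold


# Illustrative (made-up) daily counts of gastrointestinal presentations.
baseline = [12, 15, 11, 14, 13, 12, 16]
print(should_alarm(baseline, 14))  # an ordinary day
print(should_alarm(baseline, 40))  # a sharp increase
```

A rule this simple ignores day-of-week and seasonal effects, which is one reason deployed systems rely on more elaborate time-series models.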
If the actual number of patients presenting with gastrointestinal complaints, for instance, exceeds the number expected in a given geographical location over a given time period, the RODS notification subsystem sends an electronic alarm to a team of researchers and public health physicians for possible investigation. RODS software is currently open source [36] and is available for free download at www.health.pitt.edu/rods/sw.

Figure 1. Interface for the Real-time Outbreak and Disease Surveillance (RODS) system, showing syndromic classifications over a one-week period for all admissions in the specified jurisdiction.

Input variables for the detection algorithms in RODS and other biosurveillance systems must be coded data, i.e., data stored in a format that can be interpreted by a computer. For example, a biosurveillance system may monitor the number of patients with pneumonia, which may indicate a possible outbreak of influenza, SARS, or inhalational anthrax. Researchers in medical informatics have developed electronic diagnostic systems integrating multiple sources of data from a patient’s medical record to generate a probability that a patient has pneumonia [37-39]. The variables required to determine the probability of pneumonia include age and risk factors, vital signs, symptoms and physical findings, laboratory results, blood gas levels, and chest radiograph results. Values for some of the variables are stored in hospital information systems in coded format; however, variables involving a patient’s symptoms and physical findings, such as cough or adventitious respiratory sounds, and the results of a chest radiograph are usually stored as dictated reports in uncoded, free-text format. To use these variables for computerized decision support, the variables must be encoded from the textual reports. A physician reading the reports could easily determine the correct values for the variables. However, paying physicians to encode reports is impractical.
A more feasible solution is applying natural language processing (NLP) techniques to convert the free-text data into an encoded representation that can be used for later inference [40]. Over the last few decades the medical informatics community has actively applied NLP techniques to the medical domain [41, 42]. The Linguistic String Project developed one of the first medical NLP systems that included comprehensive semantic and syntactic knowledge [43-49]; the system has also been ported to French and German [50-55]. Columbia Presbyterian Medical Center has evaluated and deployed a system called MedLEE [56-60] that extracts clinical information from radiology reports, discharge summaries, visit notes, electrocardiography, echocardiography, and pathology notes. MedLEE has been shown to be as accurate as physicians at extracting clinical concepts from chest radiograph reports [61, 62] and has been evaluated for a variety of applications, including detecting patients with suspected tuberculosis [63-65]; identifying findings suspicious for breast cancer [66], stroke [67], and community-acquired pneumonia [68]; and deriving comorbidities from text [69]. Other medical informatics research groups have also created and evaluated NLP systems for extracting clinical information from medical texts and have shown them to be accurate in limited domains [70-86].

NLP techniques have been used for a variety of applications, including quality assessment in radiology [87, 88], identification of structures in radiology images [89, 90], facilitation of structured reporting [72, 91] and order entry [92, 93], and encoding variables required by automated decision support systems such as guidelines [94], diagnostic systems [95], and antibiotic therapy alarms [96]. NLP has only recently been applied to the domain of outbreak and disease surveillance, and most of the research has focused on processing free-text chief complaints recorded in the emergency department [97-102].
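To make the idea of encoding free-text chief complaints concrete, the following is a minimal sketch of a keyword-based classifier that maps a chief complaint to syndromic categories. The keyword lists, the category names, and the function `classify_chief_complaint` are illustrative assumptions only; they do not reproduce the classifiers evaluated in the cited studies, which use more sophisticated techniques.

```python
import re

# Hypothetical keyword lists; a real system would need far broader coverage,
# including abbreviations, misspellings, and negation handling.
SYNDROME_KEYWORDS = {
    "respiratory": {"cough", "sob", "wheezing", "pneumonia"},
    "gastrointestinal": {"vomiting", "diarrhea", "nausea"},
}


def classify_chief_complaint(text):
    """Return the list of syndromic categories whose keywords appear in text."""
    # Lowercase the complaint and split it into alphabetic tokens.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    matches = [
        syndrome
        for syndrome, keywords in SYNDROME_KEYWORDS.items()
        if tokens & keywords
    ]
    # Fall back to a catch-all category when nothing matches.
    return matches or ["other"]


print(classify_chief_complaint("Cough and SOB x3 days"))
print(classify_chief_complaint("ankle injury"))
```

The output of such a classifier is a coded variable (a syndrome label) that a detection algorithm can count, which is the role NLP plays in the surveillance pipeline described here.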
We have applied NLP techniques to chief complaints, ED reports, and chest radiograph reports in order to acquire coded variables that may be useful in outbreak detection. To quantify the value of NLP in the domain of biosurveillance, we have adapted a hierarchical model of technology assessment from the domain of medical imaging, described by Thornbury and Fryback [103]. First, we have evaluated the technical accuracy of our NLP techniques to answer the question “How well can we classify, extract, or encode relevant information from text?” Second, we have evaluated the diagnostic accuracy of the techniques to answer the question “How well can we diagnose patients of interest using the NLP techniques?” Third, we have evaluated the outcome efficacy of the techniques to answer the question “How well can we detect outbreaks with an NLP-based biosurveillance system?”

In the Methods section we describe the three levels of evaluation, using the hypothetical case of pneumonia surveillance as an example. We briefly describe studies we have performed to evaluate NLP technologies at all three levels and provide references for details about the studies. In the Results section we provide results from our research for the three levels of evaluation. In the Discussion section, we discuss implications of our findings and suggest three points to consider when appraising the feasibility of applying NLP to the domain of outbreak detection.
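As a rough illustration of what an accuracy evaluation at the first two levels might compute, the sketch below compares an NLP system's yes/no outputs against a reference standard. The choice of sensitivity, specificity, and positive predictive value is an assumption made for illustration, as are the function name `accuracy_measures` and the sample labels; the measures actually used are described in the Methods and Results sections.

```python
def accuracy_measures(predicted, reference):
    """Compare binary NLP outputs with a reference standard.

    predicted, reference: equal-length sequences of booleans, one per case
    (e.g., True if the report or patient is positive for the target finding).
    Returns a dict of measures; a measure is None if it is undefined.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Made-up labels for five cases, for illustration only.
nlp_output = [True, True, False, False, True]
gold_standard = [True, False, False, True, True]
measures = accuracy_measures(nlp_output, gold_standard)
print(measures)
```

The same computation applies whether the unit of analysis is a single extracted variable (technical accuracy) or a patient-level diagnosis (diagnostic accuracy); only the reference standard changes.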

[1]  Gregory F Cooper,et al.  Research Paper: Creating a Text Classifier to Detect Radiology Reports Describing Mediastinal Findings Associated with Inhalational Anthrax and Other Disorders , 2003, J. Am. Medical Informatics Assoc..

[2]  George Hripcsak,et al.  Automating a severity score guideline for community-acquired pneumonia employing medical language processing of discharge summaries , 1999, AMIA.

[3]  George Hripcsak,et al.  A comparison of the Charlson comorbidities derived from medical language processing and administrative data , 2002, AMIA.

[4]  Andrew W. Moore,et al.  Data, network, and application: technical description of the Utah RODS Winter Olympic Biosurveillance System , 2002, AMIA.

[5]  Georges De Moor,et al.  Medical Language Processing applied to extract clinical information from Dutch medical documents , 1998, MedInfo.

[6]  Scott P. Narus,et al.  Using natural language processing to analyze physician modifications to data entry templates , 2002, AMIA.

[7]  Christian Lovis,et al.  A light knowledge model for linguistic applications , 2001, AMIA.

[8]  George Hripcsak,et al.  Automated encoding of clinical documents based on natural language processing. , 2004, Journal of the American Medical Informatics Association : JAMIA.

[9]  L M Lau,et al.  A natural language understanding system combining syntactic and semantic techniques. , 1994, Proceedings. Symposium on Computer Applications in Medical Care.

[10]  Peter J. Haug,et al.  A natural language parsing system for encoding admitting diagnoses , 1997, AMIA.

[11]  Peter J. Haug,et al.  Automatic Identification of Patients Eligible for a Pneumonia Guideline: Comparing the Diagnostic Accuracy of Two Decision Support Models , 2001, MedInfo.

[12]  Wendy W. Chapman,et al.  Fever detection from free-text clinical records for biosurveillance , 2004, Journal of Biomedical Informatics.

[13]  Bruce G. Buchanan,et al.  Identifying patient subgroups with simple Bayes' , 1999, AMIA.

[14]  Martin Romacker,et al.  MedSynDikate - a natural language system for the extraction of medical information from findings reports , 2002, Int. J. Medical Informatics.

[15]  Werner Ceusters,et al.  From syntactic-semantic tagging to knowledge discovery in medical texts , 1998, Int. J. Medical Informatics.

[16]  Andrew W. Moore,et al.  Rule-based anomaly pattern detection for detecting disease outbreaks , 2002, AAAI/IAAI.

[17]  G Hripcsak,et al.  Natural language processing and its future in medicine. , 1999, Academic medicine : journal of the Association of American Medical Colleges.

[18]  Peter J. Haug,et al.  Rapid deployment of an electronic disease surveillance system in the state of Utah for the 2002 Olympic Winter Games , 2002, AMIA.

[19]  Carol Friedman,et al.  Identification of findings suspicious for breast cancer based on natural language processing of mammogram reports , 1997, AMIA.

[20]  Richard Platt,et al.  Use of Automated Ambulatory-Care Encounter Records for Detection of Acute Illness Clusters, Including Potential Bioterrorism Events , 2002, Emerging infectious diseases.

[21]  George Hripcsak,et al.  A Health Information Network for Managing Innercity Tuberculosis: Bridging Clinical Care, Public Health, and Home Care , 1999, Comput. Biomed. Res..

[22]  J. Austin,et al.  Use of natural language processing to translate clinical information from a database of 889,921 chest radiographic reports. , 2002, Radiology.

[23]  J. Rodman,et al.  Using nurse hot line calls for disease surveillance. , 1998, Emerging infectious diseases.

[24]  Peter J. Haug,et al.  Automatic extraction of PIOPED interpretations from ventilation/perfusion lung scan reports , 1998, AMIA.

[25]  Martin Romacker,et al.  MedSynDiKATe-design considerations for an ontology-based medical text understanding system , 2000, AMIA.

[26]  Vickie L. O’Dell,et al.  Recognition of illness associated with the intentional release of a biologic agent. , 2001, MMWR. Morbidity and mortality weekly report.

[27]  Weng-Keen Wong,et al.  Bayesian Biosurveillance of Disease Outbreaks , 2004, UAI.

[28]  E C Chi,et al.  Relational data base modelling of free-text medical narrative. , 1983, Medical informatics = Medecine et informatique.

[29]  P J Haug,et al.  Quantifying the characteristics of unambiguous chest radiography reports in the context of pneumonia. , 2001, Academic radiology.

[30]  Robert T. Olszewski Bayesian Classification of Triage Diagnoses for the Early Detection of Epidemics , 2003, FLAIRS.

[31]  Stephanie W. Haas,et al.  Using nurses' natural language entries to build a concept-oriented terminology for patients' chief complaints in the emergency department , 2003, J. Biomed. Informatics.

[32]  Wendy W. Chapman,et al.  Identifying Respiratory Findings in Emergency Department Reports for Biosurveillance using MetaMap , 2004, MedInfo.

[33]  Ricky K. Taira,et al.  A statistical natural language processor for medical reports , 1999, AMIA.

[34]  Peter Cameron,et al.  A major outbreak of severe acute respiratory syndrome in Hong Kong. , 2003, The New England journal of medicine.

[35]  N L Jain,et al.  Identification of suspected tuberculosis patients based on natural language processing of chest radiograph reports. , 1996, Proceedings : a conference of the American Medical Informatics Association. AMIA Fall Symposium.

[36]  Carol Friedman,et al.  Research Paper: A General Natural-language Text Processor for Clinical Radiology , 1994, J. Am. Medical Informatics Assoc..

[37]  C Lovis,et al.  A toolset for medical text processing. , 2000, Studies in health technology and informatics.

[38]  Jonathan Burstein,et al.  Usage of a web-based decision support tool for bioterrorism detection. , 2002, The American journal of emergency medicine.

[39]  Ralph Grishman Implementation of the string parser of English , 1973 .

[40]  Torsten Staab,et al.  The Rapid Syndrome Validation Project (RSVP) , 2001, AMIA.

[41]  R K Taira,et al.  Image content extraction: application to MR images of the brain. , 2001, Radiographics : a review publication of the Radiological Society of North America, Inc.

[42]  George Hripcsak,et al.  An evaluation of natural language processing methodologies , 1998, AMIA.

[43]  Peter J. Haug,et al.  Research Paper: Automatic Detection of Acute Bacterial Pneumonia from Chest X-ray Reports , 2000, J. Am. Medical Informatics Assoc..

[44]  Usha Sinha,et al.  Structured Reporting in Neuroradiology , 2002, Annals of the New York Academy of Sciences.

[45]  Alan R. Aronson,et al.  Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program , 2001, AMIA.

[46]  Bruce G. Buchanan,et al.  Using computer modeling to help identify patient subgroups in clinical data repositories , 1998, AMIA.

[47]  Peter J. Haug,et al.  Combining decision support methodologies to diagnose pneumonia , 2001, AMIA.

[48]  Neal Conrad Oliver A sublanguage based medical language processing system for German , 1992 .

[49]  Peter J. Haug,et al.  Automatic identification of patients eligible for a pneumonia guideline , 2000, AMIA.

[50]  Naomi Sager,et al.  Natural Language Information Processing: A Computer Grammar of English and Its Applications , 1980 .

[51]  Y. Elbert,et al.  Disease outbreak detection system using syndromic data in the greater Washington DC area. , 2002, American journal of preventive medicine.

[52]  R. Platt,et al.  A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism. , 2004, American journal of epidemiology.

[53]  P. Haug,et al.  Computerized extraction of coded findings from free-text radiologic reports. Work in progress. , 1990, Radiology.

[54]  C. P. Quinn,et al.  Bioterrorism-related inhalational anthrax: the first 10 cases reported in the United States. , 2001, Emerging infectious diseases.

[55]  J R Thornbury,et al.  Technology assessment--an American view. , 1992, European journal of radiology.

[56]  Ricky K. Taira,et al.  Structure localization in brain images: application to relevant image selection , 2001, AMIA.

[57]  W. DuMouchel,et al.  Unlocking Clinical Data from Narrative Reports: A Study of Natural Language Processing , 1995, Annals of Internal Medicine.

[58]  H Kangarloo,et al.  Interactive software for generation and visualization of structured findings in radiology reports. , 2000, AJR. American journal of roentgenology.

[59]  Kenneth D. Mandl,et al.  Time series modeling for syndromic surveillance , 2003, BMC Medical Informatics Decis. Mak..

[60]  Payne,et al.  Evaluation of a Command-line Parser-based Order Entry Pathway for the Department of Veterans Affairs Electronic Patient Record , 2001 .

[61]  Michael M. Wagner,et al.  Accuracy of ICD-9-coded chief complaints and diagnoses for the detection of acute respiratory illness , 2001, AMIA.

[62]  George Hripcsak,et al.  Coding Neuroradiology Reports for the Northern Manhattan Stroke Study: A Comparison of Natural Language Processing and Manual Review , 2000, Comput. Biomed. Res..

[63]  Peter J. Haug,et al.  Classifying free-text triage chief complaints into syndromic categories with natural language processing , 2005, Artif. Intell. Medicine.

[64]  Ralph Grishman,et al.  The linguistic string parser , 1973, AFIPS National Computer Conference.

[65]  Peter J. Haug,et al.  Using medical language processing to support real-time evaluation of pneumonia guidelines , 2000, AMIA.

[66]  Michael M. Wagner,et al.  Technical Description of RODS: A Real-time Public Health Surveillance System , 2003, Journal of the American Medical Informatics Association.

[67]  Carol Friedman,et al.  A broad-coverage natural language processing system , 2000, AMIA.

[68]  Michael M. Wagner,et al.  Detection of Pediatric Respiratory and Gastrointestinal Outbreaks from Free-Text Chief Complaints , 2003, AMIA.

[69]  Jean-Raoul Scherrer,et al.  Medical Language Processing for Knowledge Representation and Retrievals. , 1989 .

[70]  Wendy W. Chapman,et al.  Accuracy of three classifiers of acute gastrointestinal syndrome for syndromic surveillance , 2002, AMIA.

[71]  Werner Ceusters,et al.  Syntactic-semantic tagging as a mediator between linguistic representations and formal models: an exercise in linking SNOMED to GALEN , 1999, Artif. Intell. Medicine.

[72]  Peter Spyns Natural Language Processing in Medicine: An Overview , 1996, Methods of Information in Medicine.

[73]  R. Platt,et al.  Using automated medical records for rapid identification of illness syndromes (syndromic surveillance): the example of lower respiratory infection , 2001, BMC public health.

[74]  Martin Romacker,et al.  Creating Knowledge Repositories from Biomedical Reports: The MEDSYNDIKATE Text Mining System , 2001, Pacific Symposium on Biocomputing.

[75]  G De Moor,et al.  From natural language to formal language: when MultiTALE meets GALEN. , 1997, Studies in health technology and informatics.

[76]  Nobuhiko Okabe,et al.  [An evaluation of syndromic surveillance for the G8 Summit in Miyazaki and Fukuoka, 2000]. , 2002, Kansenshogaku zasshi. The Journal of the Japanese Association for Infectious Diseases.

[77]  William B. Lober,et al.  Roundtable on bioterrorism detection: information system-based surveillance. , 2002, Journal of the American Medical Informatics Association : JAMIA.

[78]  U Hahn,et al.  Semantic analysis of medical free texts. , 2000, Studies in health technology and informatics.

[79]  Dori B. Reissman,et al.  Clinical features that discriminate inhalational anthrax from other acute respiratory illnesses. , 2003, Clinical infectious diseases : an official publication of the Infectious Diseases Society of America.

[80]  P J Haug,et al.  Experience with a mixed semantic/syntactic parser. , 1995, Proceedings. Symposium on Computer Applications in Medical Care.

[81]  Hongfang Liu,et al.  Representing information in patient reports using natural language processing and the extensible markup language. , 1999, Journal of the American Medical Informatics Association : JAMIA.

[82]  Manfred S. Green,et al.  Surveillance for early detection and monitoring of infectious disease outbreaks associated with bioterrorism. , 2002, The Israel Medical Association journal : IMAJ.

[83]  Wendy W. Chapman,et al.  A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries , 2001, J. Biomed. Informatics.

[84]  D. Siegrist,et al.  The threat of biological attack: why concern now? , 1999, Emerging infectious diseases.

[85]  C. Irvin,et al.  Syndromic analysis of computerized emergency department patients' chief complaints: an opportunity for bioterrorism and influenza surveillance. , 2003, Annals of emergency medicine.

[86]  N L Jain,et al.  Respiratory Isolation of Tuberculosis Patients Using Clinical Guidelines and an Automated Clinical Decision Support System , 1998, Infection Control & Hospital Epidemiology.

[87]  L D Crook,et al.  Plague. A clinical review of 27 cases. , 1992, Archives of internal medicine.

[88]  N Sager,et al.  Developing a database from free-text clinical data. , 1983, Journal of clinical computing.

[89]  James F. Allen Natural language understanding , 1987, Benjamin/Cummings series in computer science.

[90]  Peter J. Haug,et al.  An integrated decision support system for diagnosing and managing patients with community-acquired pneumonia , 1999, AMIA.

[91]  R K Griffiths,et al.  Can calls to NHS Direct be used for syndromic surveillance? , 2001, Communicable disease and public health.

[92]  Galit Shmueli,et al.  Early statistical detection of anthrax outbreaks by tracking over-the-counter medication sales , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[93]  Peter J. Haug,et al.  MPLUS: a probabilistic medical language understanding system , 2002, ACL Workshop on Natural Language Processing in the Biomedical Domain.

[94]  S. Soderland,et al.  Automatic structuring of radiology free-text reports. , 2001, Radiographics : a review publication of the Radiological Society of North America, Inc.

[95]  William B. Lober,et al.  Emergency Department Data for Bioterrorism Surveillance: Electronic Data Availability, Timeliness, Sources and Standards , 2003, American Medical Informatics Association Annual Symposium.