The ngram chief complaint classifier: A novel method of automatically creating chief complaint classifiers based on international classification of diseases groupings

INTRODUCTION The ngram classifier is created by using text fragments (ngrams) to measure associations between chief complaints (CC) and a syndromic grouping of ICD-9-CM codes.

OBJECTIVES For gastrointestinal (GI) syndrome, to determine (1) the sensitivity and specificity of the ngram CC classifier and (2) the daily volumes for the ngram CC and ICD-9-CM classifiers.

METHODS DESIGN Retrospective cohort. SETTING 19 emergency departments. PARTICIPANTS Consecutive visits (1/1/2000-12/31/2005). PROTOCOL (1) An existing ICD-9-CM filter for "lower GI" was used to create the ngram CC classifier from a training set; sensitivity and specificity were then measured in a test set, using the ICD-9-CM classifier as the criterion. (2) Daily volumes based on ICD-9-CM were compared with those predicted by the ngram classifier.

RESULTS At a specificity of 0.96, sensitivity was 0.70. The daily volume correlation for ngram vs. ICD-9-CM was R=0.92.

CONCLUSION The ngram CC classifier performed similarly to manually developed CC classifiers. It has the advantages of rapid, automated creation and updating, and it may be used independently of language or dialect.
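To make the protocol concrete, the following is a minimal sketch of one way an ngram CC classifier could be built and evaluated. It is not the authors' implementation. The abstract does not specify the ngram length, the association measure, or how ngram evidence is combined, so the 4-character ngrams, the per-ngram proportion of syndrome-positive visits, the max-based score, the 0.5 threshold, and the toy chief complaints are all illustrative assumptions, as are the function names.

```python
from collections import defaultdict


def ngrams(text, n=4):
    """Return the set of character n-grams in a whitespace-normalized complaint."""
    s = " " + " ".join(text.lower().split()) + " "
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def train(complaints, labels, n=4, min_count=1):
    """Estimate, per n-gram, the proportion of training visits that carry the
    syndrome label (here assumed to come from an ICD-9-CM grouping)."""
    total = defaultdict(int)
    positive = defaultdict(int)
    for text, label in zip(complaints, labels):
        for g in ngrams(text, n):
            total[g] += 1
            positive[g] += int(label)
    return {g: positive[g] / total[g] for g in total if total[g] >= min_count}


def score(text, model, n=4):
    """Assumed combination rule: the strongest n-gram association in the text.
    N-grams unseen in training contribute 0."""
    return max((model.get(g, 0.0) for g in ngrams(text, n)), default=0.0)


def sensitivity_specificity(scores, labels, threshold):
    """Compare thresholded scores with the criterion labels."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Toy training set (hypothetical): 1 = visit falls in the GI ICD-9-CM grouping.
train_cc = ["vomiting", "vomiting and diarrhea", "diarrhea",
            "chest pain", "ankle injury", "cough fever"]
train_y = [1, 1, 1, 0, 0, 0]
model = train(train_cc, train_y)

# Toy test set (hypothetical), including a misspelled complaint.
test_cc = ["vomitting x2", "ankle pain"]
test_y = [1, 0]
test_scores = [score(cc, model) for cc in test_cc]
sens, spec = sensitivity_specificity(test_scores, test_y, threshold=0.5)
print(test_scores, sens, spec)
```

Because the score is built from text fragments rather than whole words, the misspelled complaint still shares ngrams such as " vom" with the training complaints. This illustrates why such a classifier need not depend on a fixed vocabulary, language, or dialect. In practice the threshold would be chosen on a held-out set to reach a target specificity.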
