Syndromic surveillance on the Victorian chief complaint data set using a hybrid statistical and machine learning technique

Emergency department chief complaints have previously been used to detect the size and spread of disease outbreaks. They are readily available in digital form and are a good data source for syndromic surveillance. This paper reports our findings on how selected syndromes are distributed over time in the Victorian Syndromic Surveillance (SynSurv) data set. We used a machine learning-based Naïve Bayes classifier to predict the syndromic group of unseen chief complaints. We then analyzed the distributions of three syndromes in the SynSurv data (Flu-like Illness, Acute Respiratory, and Diarrhoea) over sliding time windows using the EARS C1, C2, and C3 aberrancy detection algorithms. Our analyses show that applying the aberrancy algorithms to the variance data between two consecutive weeks flags fewer possible disease outbreaks than the large number detected from the raw frequencies of the syndromic groups over the same period, making the approach more feasible for practical syndromic surveillance.
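For readers unfamiliar with the EARS statistics named above, the following is a minimal Python sketch of C1, C2, and C3 as they are commonly described in the EARS literature: C1 standardizes the current count against the 7 preceding days, C2 uses a 7-day baseline separated from the current day by a 2-day gap, and C3 sums the excess of the last three C2 values over 1. It is not the authors' implementation. The function names, the `counts` list of per-day syndrome counts, and the `min_sd` floor for flat baselines are assumptions made for illustration; the usual alert thresholds (above 3 for C1 and C2, above 2 for C3) are likewise taken from the general EARS descriptions rather than from this paper.

```python
from statistics import mean, stdev


def _baseline_z(counts, t, lag, min_sd=0.5):
    """Standardize counts[t] against a 7-day baseline ending `lag` days before t.

    Returns None when there is not enough history. `min_sd` is a hypothetical
    floor that avoids division by zero on a perfectly flat baseline.
    """
    start = t - lag - 6
    if start < 0 or t >= len(counts):
        return None
    baseline = counts[start:start + 7]
    sd = max(stdev(baseline), min_sd)
    return (counts[t] - mean(baseline)) / sd


def c1(counts, t):
    """C1: baseline is days t-7 .. t-1 (no gap)."""
    return _baseline_z(counts, t, lag=1)


def c2(counts, t):
    """C2: baseline is days t-9 .. t-3 (2-day gap before the current day)."""
    return _baseline_z(counts, t, lag=3)


def c3(counts, t):
    """C3: sum of max(0, C2 - 1) over days t-2, t-1, and t."""
    parts = [c2(counts, i) for i in (t - 2, t - 1, t)]
    if any(p is None for p in parts):
        return None
    return sum(max(0.0, p - 1.0) for p in parts)


if __name__ == "__main__":
    # Made-up daily counts with a spike on the last day.
    counts = [10, 12, 11, 9, 10, 11, 10, 10, 11, 10, 12, 30]
    t = len(counts) - 1
    print("C1 alert:", c1(counts, t) > 3)
    print("C2 alert:", c2(counts, t) > 3)
    print("C3 alert:", c3(counts, t) > 2)
```

The same functions can be applied to any numeric series, for example a series of differences between consecutive periods instead of raw counts, which is the kind of substitution the abstract describes.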
