A Simulation-Based Study on the Comparison of Statistical and Time Series Forecasting Methods for Early Detection of Infectious Disease Outbreaks

Early detection of infectious disease outbreaks is an important issue in syndromic surveillance systems, because it enables a rapid epidemiological response and reduces morbidity and mortality. Upgrading the current system at the Korea Centers for Disease Control and Prevention (KCDC) requires a comparative study of state-of-the-art techniques. We compared four temporal outbreak detection algorithms: the CUmulative SUM (CUSUM), the Early Aberration Reporting System (EARS), the autoregressive integrated moving average (ARIMA), and the Holt-Winters algorithm. The comparison used 42 simulated time series, generated to reflect trends, seasonality, and randomly occurring outbreaks, as well as real-world daily and weekly data on diarrhea infection. The algorithms were evaluated using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1 score, symmetric mean absolute percent error (sMAPE), root-mean-square error (RMSE), and mean absolute deviation (MAD). The EARS C3 method outperformed the other algorithms overall, regardless of the characteristics of the underlying time series data. However, Holt-Winters performed better when the baseline frequency and the dispersion parameter were less than 1.5 and 2, respectively.
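
As an illustration of the evaluation metrics named above, the following minimal Python sketch computes them from per-period outbreak labels and from forecasts. It is not the authors' code. The function names, the toy data, and the specific conventions (sMAPE expressed as a percentage with denominator |actual| + |forecast|, and MAD taken as the mean absolute forecast error) are assumptions made for this example; the paper may define these quantities differently.

```python
import math


def detection_metrics(truth, alarms):
    """Confusion-matrix metrics for per-period outbreak detection.

    truth, alarms: equal-length sequences of 0/1 flags (hypothetical inputs),
    where truth marks real outbreak periods and alarms marks raised alerts.
    """
    pairs = list(zip(truth, alarms))
    tp = sum(1 for t, a in pairs if t and a)
    fp = sum(1 for t, a in pairs if not t and a)
    fn = sum(1 for t, a in pairs if t and not a)
    tn = sum(1 for t, a in pairs if not t and not a)

    def ratio(num, den):
        # Return NaN rather than raising when a metric is undefined.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def forecast_errors(actual, forecast):
    """RMSE, MAD, and sMAPE under the conventions assumed in the lead-in."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mad = sum(abs(e) for e in errors) / n
    smape_terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Assumption: a period where both values are zero contributes no error.
        smape_terms.append(2 * abs(a - f) / denom if denom else 0.0)
    smape = 100 * sum(smape_terms) / n
    return {"rmse": rmse, "mad": mad, "smape": smape}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    truth = [0, 0, 1, 1, 0, 0, 1, 0]
    alarms = [0, 1, 1, 0, 0, 0, 1, 0]
    print(detection_metrics(truth, alarms))
    print(forecast_errors([10, 12, 14], [8, 12, 17]))
```

The detection metrics apply to alarm-based methods such as CUSUM and EARS, while the error metrics apply to the forecasts produced by ARIMA and Holt-Winters.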
