Sequential detection of influenza epidemics by the Kolmogorov-Smirnov test

Background: Influenza is a well-known and common human respiratory infection that causes significant morbidity and mortality every year. Despite the variability of influenza, fast and reliable outbreak detection is required for health resource planning. Clinical health records, as published by the Diagnosticat database in Catalonia, provide useful data for probabilistic detection of influenza outbreaks.

Methods: This paper proposes a statistical method to detect influenza epidemic activity. Non-epidemic incidence rates are modeled with an exponential distribution, and the maximum likelihood estimate of its decay parameter λ is calculated. The sequential detection algorithm updates the parameter as new data become available. Binary epidemic detection of weekly incidence rates is performed with the Kolmogorov-Smirnov test, which is based on the absolute difference between the empirical distribution function and the cumulative distribution function of the estimated exponential distribution, at significance level 0 ≤ α ≤ 1.

Results: The main advantage over other approaches is the adoption of a statistically meaningful test, which provides an indicator of epidemic activity with an associated probability. The detection algorithm was initialized with parameter λ0 = 3.8617, estimated from the training sequence (the non-epidemic incidence rates of the 2008-2009 influenza season), and sequentially updated. The Kolmogorov-Smirnov test detected the following weeks as epidemic in each influenza season: weeks 50 to 10 (2008-2009 season), weeks 38 to 50 (2009-2010 season), weeks 50 to 9 (2010-2011 season), and weeks 3 to 12 (the current 2011-2012 season).

Conclusions: Real medical data were used to assess the validity of the approach, as well as to construct a realistic statistical model of weekly influenza incidence rates in non-epidemic periods. For the tested data, the results confirmed the ability of the algorithm to detect the start and the end of epidemic periods. In general, the proposed test could be applied to other data sets to quickly detect influenza outbreaks. The sequential structure of the test makes it suitable for implementation on many platforms at low computational cost, without the need to store large data sets.
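
The sequential scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the Kolmogorov-Smirnov test is applied to each weekly rate, so the sketch assumes the simplest reading, a one-sample test on a single observation x, for which the statistic reduces to D = max(F(x), 1 − F(x)) and the exact critical value at level α is 1 − α/2. The function names (`ks_single`, `detect`) and the example rates are hypothetical and synthetic. The exponential maximum likelihood estimate is λ = n / Σx, and it is updated from a running count and sum so that past data need not be stored.

```python
import math


def exp_cdf(x, lam):
    """CDF of the exponential distribution with rate parameter lam."""
    return 1.0 - math.exp(-lam * x)


def ks_single(x, lam):
    """KS distance for one observation: sup|F_1 - F| = max(F(x), 1 - F(x))."""
    f = exp_cdf(x, lam)
    return max(f, 1.0 - f)


def detect(training, stream, alpha=0.05):
    """Flag each weekly rate in `stream` as epidemic (True) or not (False).

    Assumption: a week is epidemic when the single-observation KS test
    rejects Exp(lam) at level alpha. Only non-epidemic weeks update lam.
    """
    n, total = len(training), sum(training)
    lam = n / total                    # MLE of the exponential rate
    critical = 1.0 - alpha / 2.0       # exact critical value for n = 1
    flags = []
    for x in stream:
        epidemic = ks_single(x, lam) > critical
        flags.append(epidemic)
        if not epidemic:               # sequential update of the MLE
            n += 1
            total += x
            lam = n / total
    return flags, lam


if __name__ == "__main__":
    # Synthetic rates in arbitrary units, for illustration only.
    baseline = [0.05, 0.1, 0.2, 0.3, 0.15, 0.6, 0.4, 0.25]
    weekly = [0.2, 0.3, 1.5, 2.0, 0.25]
    flags, lam = detect(baseline, weekly)
    print(flags)  # -> [False, False, True, True, False]
```

Because the test in this sketch is two-sided, a rate very close to zero would also be rejected; a one-sided variant that only flags unusually high rates may be closer to what a surveillance system needs.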
