Implementation and Comparison of Preprocessing Methods for Biosurveillance Data

Modern biosurveillance relies on multiple sources of both prediagnostic and diagnostic data, updated daily, to discover disease outbreaks. Two assumptions are intrinsic to this effort: (1) the data being analyzed contain early indicators of a disease outbreak, and (2) the outbreaks to be detected are not known a priori. In addition to outbreak indicators, however, syndromic data streams contain explainable factors such as day-of-week effects, seasonal effects, autocorrelation, and global trends. These factors obscure unexplained outbreak events, and their presence in the data violates standard control-chart assumptions. As a result, monitoring tools such as Shewhart, cumulative sum, and exponentially weighted moving average control charts alert largely in response to the explainable factors rather than to outbreaks. The goal of this paper is twofold: first, to describe a set of tools for identifying explainable patterns such as temporal dependence and, second, to survey and examine several data preconditioning methods that significantly reduce these explainable factors, yielding data better suited for monitoring with the popular control charts.
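As a rough illustration of the two goals, the sketch below computes a sample autocorrelation (one tool for identifying temporal dependence) and applies lag-7 differencing (one common way to remove a day-of-week pattern and a linear trend). The abstract does not name specific methods, so the choice of differencing, the synthetic series, and the names `autocorr`, `seasonal_difference`, and `synthetic_counts` are all assumptions made for this example, not the paper's procedure.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def seasonal_difference(x, period=7):
    """Subtract the value observed `period` days earlier from each value."""
    return [x[t] - x[t - period] for t in range(period, len(x))]


def synthetic_counts(n_days=364, seed=0):
    """Hypothetical daily series: baseline + linear trend + weekday offset + noise."""
    rng = random.Random(seed)
    dow_effect = [20, 15, 10, 10, 12, -30, -37]  # assumed weekday offsets
    return [
        100 + 0.1 * t + dow_effect[t % 7] + rng.gauss(0, 5)
        for t in range(n_days)
    ]


raw = synthetic_counts()
diffed = seasonal_difference(raw)

# Compare temporal dependence before and after preconditioning.
for lag in (1, 7):
    print(f"lag {lag}: raw={autocorr(raw, lag):.2f}, "
          f"differenced={autocorr(diffed, lag):.2f}")
```

In this synthetic series the weekly pattern dominates the raw lag-7 autocorrelation, and differencing removes both the pattern and the trend. Differencing also introduces a negative lag-7 correlation in the noise term, so the preconditioned series should still be checked for residual dependence before a control chart is applied.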
