A novel measure for data stream anomaly detection in a bio-surveillance system

The primary concern of a bio-surveillance system is to analyze and interpret data as they are collected and then decide whether further investigation is required. Decision makers need to know whether the data in the current test interval differ enough from the expected counts to trigger an alert. Although a number of detection methods have been proposed, the literature indicates that users of current systems can still experience extremely high false alarm rates. We propose a novel measure for bio-surveillance that takes into account both anomaly magnitude and anomaly frequency. Experimental results show that the proposed measure performs better than conventional measures.
