Multiple source spatial cluster detection via multi-criteria analysis

Compared with single-source methods, combining multiple data sources is essential for obtaining reliable information about emerging health threats. Spatial scan statistics have been adapted to analyze multivariate data sources, but only ad hoc procedures exist for selecting the most likely cluster and computing its significance. In this work, information from multiple disease surveillance data sources is incorporated, using tools from multi-criteria analysis, to achieve more coherent spatial cluster detection. The best cluster solutions are found by simultaneously maximizing two objective functions, with candidates compared through the concept of dominance. The statistical significance of the solutions is evaluated with an approach based on the attainment function. The multi-criteria approach has several advantages: the evaluation function for each data source is represented clearly and is not artificially, and possibly confusingly, mixed with the evaluations of the other data sources; a statistical significance can be rigorously attributed to each candidate cluster; and the best cluster solutions can be analyzed and selected directly from the non-dominated set, where they arise naturally. The methodology is illustrated with real datasets.
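As a rough illustration of the two ideas named above, the following Python sketch filters candidate clusters down to the non-dominated set under two maximized objectives, and then estimates a significance level for one candidate from Monte Carlo replications. Everything here is an assumption made for illustration: the function names, the toy scores, the toy null fronts, and the add-one Monte Carlo estimate are hypothetical and are not taken from the paper, which may define its attainment-based test differently.

```python
from typing import List, Sequence, Tuple

# A candidate cluster is summarized by two scores, one per data source
# (for example, a scan statistic value computed on each source).
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b under maximization:
    a is at least as good in both objectives and strictly better in one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def non_dominated(points: Sequence[Point]) -> List[Point]:
    """Return the sorted, de-duplicated points not dominated by any other."""
    unique = sorted(set(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def attains(front: Sequence[Point], z: Point) -> bool:
    """True if some point of the front is at least as good as z
    in both objectives, i.e. the front 'attains' z."""
    return any(p[0] >= z[0] and p[1] >= z[1] for p in front)


def attainment_p_value(z: Point, null_fronts: Sequence[Sequence[Point]]) -> float:
    """Monte Carlo estimate of how often a non-dominated set obtained under
    the null hypothesis attains z. The add-one correction is an assumption
    of this sketch, borrowed from standard Monte Carlo testing."""
    hits = sum(attains(front, z) for front in null_fronts)
    return (hits + 1) / (len(null_fronts) + 1)


# Toy scores for five hypothetical candidate clusters.
scores = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.5, 1.5), (2.0, 1.0)]
front = non_dominated(scores)

# Toy non-dominated sets from four hypothetical null replications.
null_fronts = [
    [(1.0, 1.0)],
    [(2.5, 0.5), (0.5, 2.5)],
    [(2.5, 2.5)],
    [(0.8, 0.9)],
]
p_value = attainment_p_value((2.0, 2.0), null_fronts)

print(front)
print(p_value)
```

Keeping the two scores as a pair, rather than summing or otherwise combining them, is what lets each data source's evaluation stay separate; the dominance filter then returns every candidate that no other candidate beats on both sources at once.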
