Research Paper: Power to Detect Spatial Disturbances under Different Levels of Geographic Aggregation

OBJECTIVE Spatial and/or temporal surveillance systems are designed to monitor the ongoing appearance of disease cases in space and time, and to detect potential disturbances in either dimension. Patient addresses are sometimes reported at some level of geographic aggregation, for example by ZIP code or census tract. While this aggregation has the advantage of protecting patient privacy, it also risks compromising statistical efficiency. This paper investigated how the power to detect a change in the spatial distribution of cases varies with the level of spatial aggregation.

METHODS The authors generated 400,000 spatial datasets with varying location and spread of simulated spatial disturbances, both on a purely synthetic uniform population and on a heterogeneous population representing hospital admissions to three community hospitals in Cape Cod, Massachusetts. The authors evaluated the power of the M-statistic, which compares two distributions of interpoint distances between locations, to detect spatial disturbances, comparing the use of exact spatial locations with twelve different levels of aggregation.

RESULTS When the simulated spatial disturbance was confined to a small portion of the study region or affected a large proportion of the population at risk, power was highest when exact locations were reported. When the spatial disturbance was a more modest signal, the best power was attained at an aggregated level.

CONCLUSIONS The precision at which patients' locations are reported has the potential to affect the power of detection significantly.
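
To make the idea of comparing two interpoint distance distributions concrete, the following is a minimal Python sketch under stated assumptions; it is not the authors' M-statistic. It bins pairwise distances and sums squared differences in bin proportions, without the covariance weighting a formal test statistic would use. The function names (`interpoint_distances`, `distance_discrepancy`, `aggregate_to_grid`), the square-grid aggregation, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def interpoint_distances(points):
    """Return all pairwise Euclidean distances among the rows of `points`."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return dist[upper]


def distance_discrepancy(observed, baseline, n_bins=10):
    """Unweighted squared difference between binned distance proportions.

    Assumption: bin edges are quantiles of the baseline distances. This is
    a simplified stand-in for a formal interpoint-distance test statistic.
    """
    d_obs = interpoint_distances(observed)
    d_base = interpoint_distances(baseline)
    edges = np.quantile(d_base, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0] = 0.0
    edges[-1] = max(d_obs.max(), d_base.max())  # cover both samples
    p_obs = np.histogram(d_obs, bins=edges)[0] / len(d_obs)
    p_base = np.histogram(d_base, bins=edges)[0] / len(d_base)
    return float(((p_obs - p_base) ** 2).sum())


def aggregate_to_grid(points, cell_size):
    """Replace each point by the centre of its square grid cell.

    Assumption: a regular grid stands in for ZIP codes or census tracts.
    """
    return (np.floor(points / cell_size) + 0.5) * cell_size


# Illustrative data: a uniform baseline and a sample with a tight cluster.
rng = np.random.default_rng(0)
baseline = rng.uniform(0.0, 1.0, size=(300, 2))
background = rng.uniform(0.0, 1.0, size=(200, 2))
cluster = rng.normal(loc=0.5, scale=0.02, size=(100, 2))
disturbed = np.vstack([background, cluster])

exact = distance_discrepancy(disturbed, baseline)
coarse = distance_discrepancy(aggregate_to_grid(disturbed, 0.25), baseline)
print(f"exact locations: {exact:.4f}, aggregated to 0.25 cells: {coarse:.4f}")
```

Estimating power in this setting would mean repeating such a comparison over many simulated datasets and counting how often the discrepancy exceeds a threshold calibrated under no disturbance. The sketch omits that step.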
