Background
In geographic surveillance of disease, the aim is to identify areas with large numbers of disease cases so that the causes of high disease rates can be investigated. Areas with high rates are called disease clusters, and statistical cluster detection tests are used to identify geographic areas with higher disease rates than expected by chance alone. Typically, cluster detection tests are applied to incident or prevalent cases of disease, but surveillance of disease-related events, where an individual may have multiple events, may also be of interest. A compound Poisson approach has previously been proposed that detects clusters of events by testing individual areas, which may be combined with their neighbours. However, the relevant probabilities from the compound Poisson distribution are obtained from a recursion relation that can be cumbersome if the number of events is large or if analyses by strata are performed. We propose a simpler approach that uses an approximate normal distribution. This method is easy to implement and is applicable when population sizes are large and the population distribution by important strata may differ by area. We demonstrate the approach on pediatric self-inflicted injury presentations to emergency departments and compare the results for probabilities based on the recursion and on the normal approach. We also implement a Monte Carlo simulation to study the performance of the proposed approach.

Results
In the self-inflicted injury data example, twelve of the thirteen clusters identified by the normal approach coincide with the twelve significant clusters detected by the compound Poisson approach. In simulation studies, the normal approach approximates the compound Poisson approach well for a variety of population sizes and case and event thresholds.

Conclusion
A drawback of the compound Poisson approach is that the relevant probabilities must be determined through a recursion relation, and such calculations can be computationally intensive if the cluster size is relatively large or if analyses are conducted with strata variables. The normal approach, by contrast, is flexible and easily implemented, and hence more appealing for users. Moreover, its concepts may be more easily conveyed to non-statisticians interested in understanding the methodology behind cluster detection test results.
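The contrast between the two ways of obtaining tail probabilities can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the events-per-case distribution `event_pmf`, and the half-unit continuity correction are all assumptions made for illustration. It assumes the number of cases in an area is Poisson with mean `lam` and that the number of events per case follows `event_pmf`, so the total number of events is compound Poisson with mean `lam * E[X]` and variance `lam * E[X^2]`.

```python
import math


def compound_poisson_pmf(lam, event_pmf, max_total):
    """Panjer-type recursion for P(total events = s), s = 0..max_total.

    event_pmf[j] is the (hypothetical) probability that one case has j events.
    """
    f = event_pmf
    g = [math.exp(-lam * (1.0 - f[0]))]
    for s in range(1, max_total + 1):
        top = min(s, len(f) - 1)
        acc = sum(j * f[j] * g[s - j] for j in range(1, top + 1))
        g.append(lam / s * acc)
    return g


def upper_tail_recursion(lam, event_pmf, observed):
    """P(total events >= observed) from the recursion."""
    if observed <= 0:
        return 1.0
    return 1.0 - sum(compound_poisson_pmf(lam, event_pmf, observed - 1))


def upper_tail_normal(lam, event_pmf, observed):
    """Normal approximation to the same tail probability.

    Uses the compound Poisson mean and variance; the 0.5 continuity
    correction is an illustrative choice, not taken from the source.
    """
    m1 = sum(j * p for j, p in enumerate(event_pmf))
    m2 = sum(j * j * p for j, p in enumerate(event_pmf))
    mean = lam * m1
    sd = math.sqrt(lam * m2)
    z = (observed - 0.5 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical area: 200 expected cases, each with 1, 2 or 3 events.
    pmf = [0.0, 0.7, 0.2, 0.1]
    for threshold in (300, 320, 340):
        exact = upper_tail_recursion(200.0, pmf, threshold)
        approx = upper_tail_normal(200.0, pmf, threshold)
        print(threshold, round(exact, 4), round(approx, 4))
```

The recursion needs a sum over earlier terms for every possible total up to the observed count, whereas the normal version needs only the first two moments of the events-per-case distribution, which is why it remains cheap when counts are large or when the calculation is repeated within strata.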