ACOMCD: A multiple cluster detection algorithm based on the spatial scan statistic and ant colony optimization

The spatial scan statistic (SaTScan) has become one of the most popular methods for detecting and evaluating spatial clusters. However, it can only identify circular or elliptical clusters and is poorly suited to detecting irregularly shaped clusters. Numerous methods have since been proposed to solve this problem. Nevertheless, when multiple clusters coexist, these methods may fail to recover the true cluster configuration, because interference between clusters can easily produce a single tree-like cluster and confuse the results. In this paper, we propose an Ant Colony Optimization based Multiple Cluster Detection (ACOMCD) algorithm, which combines classical SaTScan with the ant colony optimization (ACO) approach. In the initial stage, SaTScan is used to mark candidate cluster areas according to the significance of their maximum likelihood evaluations. An ACO-based scan statistic is then run separately on each candidate cluster to identify its natural shape. The algorithm is designed for spatial regional count data only. Comparisons of ACOMCD with SaTScan, GaScan (genetic algorithm-based scan statistic), and FleXScan (flexibly shaped spatial scan statistic) on three kinds of simulated datasets show that ACOMCD performs best at simultaneously determining the exact number of clusters and identifying multiple irregularly shaped clusters. A case study on esophageal cancer in eastern China further validates the correctness and effectiveness of ACOMCD.
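To make the initial stage concrete, the following is a minimal sketch of a circular scan over regional count data using the standard Poisson log-likelihood ratio of the spatial scan statistic. It is an illustration under stated assumptions, not the ACOMCD implementation: the function names (`poisson_llr`, `circular_scan`), the 50% population cap, and the toy data are all hypothetical, and both the Monte Carlo significance test and the ACO refinement stage are omitted.

```python
import math

def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for a zone with c observed and e expected
    cases, out of `total` cases overall. Returns 0 unless the zone has excess risk."""
    if e <= 0 or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes when every case falls inside the zone.
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside

def circular_scan(coords, cases, pops, max_pop_frac=0.5):
    """Grow a circular window around each region centroid and return the zone
    (sorted list of region indices) with the largest log-likelihood ratio."""
    total_cases = sum(cases)
    total_pop = sum(pops)
    best_zone, best_llr = [], 0.0
    for cx, cy in coords:
        # Regions ordered by distance from the current centroid (index breaks ties).
        order = sorted(
            range(len(coords)),
            key=lambda i: (math.hypot(coords[i][0] - cx, coords[i][1] - cy), i),
        )
        zone, c, pop = [], 0, 0
        for i in order:
            if pop + pops[i] > max_pop_frac * total_pop:
                break  # assumed cap on window size (hypothetical parameter)
            zone.append(i)
            c += cases[i]
            pop += pops[i]
            e = total_cases * pop / total_pop  # expected cases under the null
            llr = poisson_llr(c, e, total_cases)
            if llr > best_llr:
                best_zone, best_llr = sorted(zone), llr
    return best_zone, best_llr

# Toy data (made up for illustration): five regions along a line,
# with elevated counts in regions 2 and 3.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.9, 0.0), (4.0, 0.0)]
cases = [2, 2, 10, 12, 2]
pops = [100, 100, 100, 100, 100]

best_zone, best_llr = circular_scan(coords, cases, pops)
print(best_zone, round(best_llr, 3))
```

On this toy input the scan selects regions 2 and 3 as the most likely cluster. In a two-stage scheme of the kind the abstract describes, zones found this way would serve only as candidate areas, and a separate search would then be run inside each one to recover an irregular shape.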
