Detecting arbitrarily shaped clusters using ant colony optimization

On a map of geo-referenced population and case counts, detecting the most likely cluster (MLC), which consists of many connected polygons (e.g., census tract boundaries), faces two difficulties: the cluster's shape may be irregular, and the cluster may be heterogeneous. A heterogeneous cluster is one that contains depression links. A polygon is a depression link if it satisfies two conditions: (1) the ratio of cases to population in the polygon is below the average ratio for the whole map; and (2) removing the polygon disconnects the cluster. Previous studies have successfully detected arbitrarily shaped clusters that contain no depression links. For a heterogeneous cluster, however, existing methods may err, for example by missing parts of the cluster. This article proposes a spatial scanning method based on ant colony optimization (AntScan) to improve detection power. If each polygon is reduced to a node, the study area can be treated as a graph, and detecting the MLC becomes a search for the best subgraph, that is, the one with the largest likelihood value. A comparison of AntScan with GAScan (a spatial scan method based on genetic optimization) and SAScan (a spatial scan method based on simulated annealing) indicates that (1) the performance of GAScan and SAScan is significantly influenced by the fraction value parameter (the maximum allowed size of the detected cluster), which can only be estimated through multiple trials, whereas AntScan needs no such parameter; and (2) AntScan has greater power than GAScan and SAScan in detecting heterogeneous clusters.
The case study on esophageal cancer in North China demonstrates that the cluster identified by AntScan has a larger likelihood value than the one detected by SAScan and covers all high-risk regions of esophageal cancer, whereas SAScan misses some high-risk regions (the region in the southwest of Shandong province, eastern China) because of a depression link.
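
The two conditions that define a depression link can be checked directly once the map is treated as a graph. The following is a minimal Python sketch, not the article's implementation. It assumes that the "average ratio of the whole map" means total cases divided by total population, that every polygon has a positive population, and that adjacency is supplied as a dictionary of neighbor lists. The function names and the toy data are hypothetical.

```python
from collections import deque


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph (breadth-first search)."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def depression_links(cluster, adjacency, cases, population):
    """List polygons in `cluster` that meet both depression-link conditions."""
    # Assumption: the map-wide average ratio is total cases / total population.
    overall_rate = sum(cases.values()) / sum(population.values())
    cluster = set(cluster)
    links = []
    for polygon in sorted(cluster):
        # Condition (1): the polygon's own rate is below the map-wide rate.
        below_average = cases[polygon] / population[polygon] < overall_rate
        # Condition (2): removing the polygon disconnects the rest of the cluster.
        rest = cluster - {polygon}
        disconnects = bool(rest) and not is_connected(rest, adjacency)
        if below_average and disconnects:
            links.append(polygon)
    return links


# Toy map: a chain A-B-C-D-E-F-G in which C has a low rate and bridges the cluster.
adjacency = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"],
    "E": ["D", "F"], "F": ["E", "G"], "G": ["F"],
}
cases = {"A": 10, "B": 10, "C": 1, "D": 10, "E": 10, "F": 1, "G": 1}
population = {name: 100 for name in adjacency}
cluster = ["A", "B", "C", "D", "E"]

print(depression_links(cluster, adjacency, cases, population))
```

In this toy map, B and D also disconnect the cluster when removed, but their rates are above the map-wide rate, so only C satisfies both conditions. A scan method that refuses to include low-rate polygons would split this cluster at C, which is the kind of miss the abstract attributes to existing methods.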
