REFINEMENT OF CLUSTERS FROM K-MEANS WITH ANT COLONY OPTIMIZATION

Clustering is the partitioning of data into groups of similar objects. In this paper, Ant Colony Optimization (ACO) is proposed to improve k-means clustering. Although k-means is one of the best clustering algorithms, the quality of its result depends on the starting conditions, and it may converge to a local minimum. Moreover, research so far has not addressed improving cluster quality after grouping. The proposed method has two phases. In the first phase, the initial seeds for k-means clustering are selected based on statistical modes, so that the algorithm converges to a “better” local minimum. In the second phase, a novel ant-based refinement algorithm is applied to improve the cluster quality. The proposed algorithm is tested on data from the medical domain, and the results show that refined initial starting points and post-processing refinement of clusters do lead to improved solutions.
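The dependence of k-means on its starting condition can be illustrated with a minimal sketch. It is not the method proposed in the paper: the function name `kmeans_1d`, the toy data, and the two seed choices are assumptions made only for illustration, and neither the mode-based seeding nor the ant-based refinement is implemented here. The sketch runs plain Lloyd-style k-means on one-dimensional data from two different sets of seeds and compares the resulting sum of squared errors (SSE).

```python
def kmeans_1d(points, seeds, max_iter=100):
    """Plain Lloyd-style k-means on 1-D data; returns (centroids, sse)."""
    centroids = list(seeds)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    # Sum of squared distances from each point to its nearest centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


# Hypothetical toy data: three well-separated pairs of points.
points = [0, 1, 10, 11, 20, 21]

# Two seeds fall in the same natural group: k-means gets stuck.
bad_centroids, bad_sse = kmeans_1d(points, seeds=[0, 1, 10])

# One seed per natural group: k-means finds the intended partition.
good_centroids, good_sse = kmeans_1d(points, seeds=[0, 10, 20])

print(bad_centroids, bad_sse)
print(good_centroids, good_sse)
```

Both runs converge, but the first stops at a much poorer local minimum than the second. This is the kind of gap that careful seed selection and a post-clustering refinement step are meant to reduce.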
