A new clustering approach based on Glowworm Swarm Optimization

High-quality clustering techniques are required for the effective analysis of ever-growing volumes of data. Clustering is a common data mining technique that groups homogeneous data instances based on their characteristics. Clustering algorithms based on nature-inspired optimization have received much attention because they can find better solutions to clustering analysis problems. Glowworm Swarm Optimization (GSO) is a recent nature-inspired optimization algorithm that simulates the behavior of glowworms. The GSO algorithm is useful for the simultaneous search of multiple solutions that have different or equal objective function values. In this paper, a clustering-based GSO (CGSO) is proposed, in which GSO is adapted to the data clustering problem so that multiple optimal centroids are located using the multimodal search capability of GSO. The CGSO process ensures that the similarity among members of the same cluster is maximized and the similarity between members of different clusters is minimized. Furthermore, three special fitness functions are proposed to evaluate the goodness of the GSO individuals in achieving high-quality clusters. The proposed algorithm is tested on artificial and real-world data sets. On most of these data sets, it outperforms four popular clustering algorithms. The results reveal that CGSO can be used efficiently for data clustering.
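
To make the multimodal search concrete, the following is a minimal sketch of one iteration of standard GSO (luciferin update, probabilistic movement toward a brighter neighbor, and decision-range update), applied to candidate centroids. It is not the CGSO implementation from the paper: the function names, the parameter values, and in particular the kernel-density fitness `density_fitness` are assumptions made for illustration and are not one of the three fitness functions proposed here.

```python
import numpy as np

def density_fitness(p, data, h=0.5):
    # Hypothetical fitness (assumption, not from the paper): a Gaussian
    # kernel density score, which is higher when p sits in a dense region.
    sq = np.sum((data - p) ** 2, axis=1)
    return float(np.sum(np.exp(-sq / (2.0 * h * h))))

def gso_step(pos, luciferin, radius, fitness, rng,
             rho=0.4, gamma=0.6, beta=0.08, step=0.03, r_s=3.0, n_t=5):
    """One iteration of standard GSO; `fitness` is maximized.

    All default parameter values are illustrative assumptions.
    """
    # Luciferin update: decay the old value and add a fitness-based reward.
    scores = np.array([fitness(p) for p in pos])
    luciferin = (1.0 - rho) * luciferin + gamma * scores

    new_pos = pos.copy()
    new_radius = radius.copy()
    for i in range(len(pos)):
        dist = np.linalg.norm(pos - pos[i], axis=1)
        # Neighbors: glowworms inside the decision range that are brighter.
        nbrs = np.where((dist < radius[i]) & (luciferin > luciferin[i]))[0]
        if nbrs.size > 0:
            # Pick a neighbor with probability proportional to the
            # luciferin difference, then take a fixed-size step toward it.
            diff = luciferin[nbrs] - luciferin[i]
            j = rng.choice(nbrs, p=diff / diff.sum())
            direction = pos[j] - pos[i]
            norm = np.linalg.norm(direction)
            if norm > 0.0:
                new_pos[i] = pos[i] + step * direction / norm
        # Decision-range update: shrink when crowded, grow when isolated.
        new_radius[i] = min(r_s, max(0.0, radius[i] + beta * (n_t - nbrs.size)))
    return new_pos, luciferin, new_radius
```

Because each glowworm only follows brighter neighbors inside its own decision range, the swarm can split into subgroups that settle on different local peaks of the fitness, which is the property the abstract relies on for locating several centroids at once. Repeating `gso_step` and then merging glowworms that end up close together would be one possible way to read off centroids, again as an assumption rather than the procedure used in the paper.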
