A Hybrid Data Clustering Approach Based on Hydrologic Cycle Optimization and K-means

K-means is a simple and popular clustering method that efficiently groups data into a predefined number of clusters, K. However, K-means is sensitive to poorly chosen initial centers and tends to converge prematurely. Hydrologic Cycle Optimization (HCO), a recent algorithm inspired by natural phenomena, has a strong ability to search for globally optimal solutions. To overcome these drawbacks of K-means and to find better initial centroids, this study proposes a hybrid clustering algorithm based on Hydrologic Cycle Optimization and K-means (abbreviated as HCO+K-means). The proposed algorithm consists of two modules: an HCO module and a K-means module. It first executes the HCO module to find the best individual, that is, the one with the optimal fitness value. The position of this best individual is then used as the initial set of centers for the K-means module, which searches for a higher-quality clustering solution. For comparison, the K-means, PSO+K-means, WCA+K-means and HCO+K-means algorithms are evaluated on six different datasets. The experimental results indicate that the proposed HCO+K-means algorithm has a strong global search ability and obtains better clustering results than the other clustering methods.
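The two-module structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the HCO operators or the fitness function, so the sketch substitutes a plain random search for the HCO module (the hypothetical `global_search`) and assumes the fitness is the sum of squared distances from each point to its nearest center (the hypothetical `sse`). Only the hand-off, where the best candidate found by the global stage seeds K-means, reflects the text.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centers):
    """Assumed fitness: total squared distance of each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def global_search(points, k, n_candidates=50, rng=None):
    """Stand-in for the HCO module (NOT HCO itself): random search over
    candidate center sets, keeping the one with the lowest fitness."""
    rng = rng or random.Random(0)
    best, best_fit = None, float("inf")
    for _ in range(n_candidates):
        candidate = rng.sample(points, k)  # k distinct data points as centers
        fit = sse(points, candidate)
        if fit < best_fit:
            best, best_fit = candidate, fit
    return best


def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations started from the supplied centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def hybrid_cluster(points, k, rng=None):
    """Global stage first; its best candidate becomes the K-means initial centers."""
    initial_centers = global_search(points, k, rng=rng)
    return kmeans(points, initial_centers)
```

Calling `hybrid_cluster(points, k)` on a list of equal-length numeric tuples returns `k` centers. Because the K-means stage only refines the seed it is given, the final fitness under this assumed measure is never worse than that of the seed.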
