An efficient hybrid data clustering method based on Candidate Group Search and genetic algorithm

Data mining is an efficient data analysis process used to find patterns and relationships in large databases. Clustering is a popular data mining technique for unsupervised learning, in which class labels are not defined in advance. K-Means is a well-known partitioning technique for forming clusters, but it is sensitive to the choice of initial centroids and tends to converge to local optima. The K-Harmonic Means algorithm solves the initialization sensitivity problem, but it can still get stuck in local optima. The genetic algorithm is an efficient tool for search and optimization problems and offers benefits such as selective search. This paper presents a new scheme in which the initial centroids are calculated using Candidate Group Search, which reduces the time required by the genetic process. The genetic algorithm is then used to assign each data element to a suitable cluster. The experimental results showed that the proposed scheme automatically finds the cluster centers and reaches the globally optimal solution for data clustering.
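The second stage of the scheme, in which a genetic algorithm assigns data elements to clusters starting from precomputed centroids, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sum-of-squared-errors fitness, the one-point crossover, and all parameter values are assumptions chosen for illustration, and the Candidate Group Search step is not implemented here. The `init_centroids` argument simply stands in for whatever centroids that step would produce.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroids_of(points, labels, k, fallback):
    """Mean of each cluster; an empty cluster keeps its fallback centroid."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else list(fallback[c])
        for c in range(k)
    ]

def sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[c]) for p, c in zip(points, labels))

def ga_cluster(points, init_centroids, pop_size=20, generations=40,
               mutation_rate=0.05, seed=0):
    """Evolve a cluster label per point (hypothetical helper, assumed design).

    init_centroids is assumed to come from a separate initialization step
    such as Candidate Group Search, which this sketch does not implement.
    """
    rng = random.Random(seed)
    k, n = len(init_centroids), len(points)

    def fitness(labels):
        # Lower is better: SSE against the centroids implied by the labels.
        return sse(points, labels, centroids_of(points, labels, k, init_centroids))

    def mutate(labels):
        # Reassign each point to a random cluster with small probability.
        return [rng.randrange(k) if rng.random() < mutation_rate else c
                for c in labels]

    # Seed individual: every point goes to its nearest initial centroid.
    seed_labels = [min(range(k), key=lambda c: sq_dist(p, init_centroids[c]))
                   for p in points]
    population = [seed_labels] + [mutate(seed_labels) for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            children.append(mutate(a[:cut] + b[cut:]))
        population = parents + children

    best = min(population, key=fitness)
    return best, centroids_of(points, best, k, init_centroids)

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = ga_cluster(points, [(0.0, 0.0), (10.0, 10.0)])
```

Because the better half of each generation is carried over unchanged, the best fitness never gets worse than that of the seed assignment. Starting the population near a good initialization is what lets the genetic stage run for fewer generations, which is the time saving the abstract attributes to Candidate Group Search.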
