AN EFFICIENT HYBRID APPROACH FOR DATA CLUSTERING USING DYNAMIC K-MEANS ALGORITHM AND FIREFLY ALGORITHM

Clustering is an important data mining task that groups data into meaningful subsets so that information can be retrieved from a given dataset. It is also known as unsupervised learning, since the data objects are assigned to a collection of clusters that can then be interpreted as classes. The proposed approach builds on the K-means algorithm to enhance cluster quality and to fix the optimal number of clusters; the number of clusters (K) is taken as input. The firefly algorithm is mainly used for solving optimization problems. The proposed approach uses a dynamic K-means algorithm for dynamic data clustering. It can be applied both when the number of clusters is known and when it is unknown: the user can either fix the number of clusters or fix the minimum number of required clusters. If the number of clusters is static, it works like the K-means algorithm. If the number of clusters is dynamic, the algorithm determines new cluster centers by adding one to the cluster counter in each iteration until the required cluster quality is achieved. The proposed method uses a modified firefly algorithm to determine the centroids of the user-specified number of clusters. This algorithm can be extended using dynamic K-means clustering to enhance the centroids and clusters. Thus the proposed dynamic clustering method increases cluster quality, and the modified firefly algorithm increases optimality, for the iris and wine datasets. Experimental results show that the proposed methodology attains maximum cluster quality within a limited time and achieves better optimality.
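The hybrid loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the quality measure, the rule for placing a new center, or the modification made to the firefly algorithm. The sketch therefore assumes mean squared distance to the assigned centroid as the quality measure, seeds each added cluster at the worst-fitted point, and uses the standard firefly update (attractiveness `beta0 * exp(-gamma * r^2)` plus a random step scaled by `alpha`). All function and parameter names are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def mean_sse(points, centroids, labels):
    """Assumed quality measure: mean squared distance to the assigned centroid."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels)) / len(points)


def kmeans(points, centroids, max_iter=100):
    """Plain K-means (Lloyd) iterations from the given starting centroids."""
    centroids = [list(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


def firefly_centroids(points, k, n_fireflies=10, n_iter=30,
                      alpha=0.1, beta0=1.0, gamma=1.0, rng=None):
    """Standard (unmodified) firefly search; each firefly holds k candidate centroids."""
    rng = rng or random.Random(0)
    swarm = [[list(p) for p in rng.sample(points, k)] for _ in range(n_fireflies)]

    def cost(firefly):
        return mean_sse(points, firefly, assign(points, firefly))

    costs = [cost(f) for f in swarm]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] < costs[i]:  # j is brighter, so i moves towards j
                    r2 = sum(sq_dist(a, b) for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [[x + beta * (y - x) + alpha * (rng.random() - 0.5)
                                 for x, y in zip(a, b)]
                                for a, b in zip(swarm[i], swarm[j])]
                    costs[i] = cost(swarm[i])
    best = min(range(n_fireflies), key=lambda i: costs[i])
    return swarm[best]


def dynamic_kmeans(points, k_min, target, k_max=10, seed=0):
    """Grow k by one per round until mean_sse <= target or k reaches k_max."""
    rng = random.Random(seed)
    centroids = firefly_centroids(points, k_min, rng=rng)
    while True:
        centroids, labels = kmeans(points, centroids)
        quality = mean_sse(points, centroids, labels)
        if quality <= target or len(centroids) >= k_max:
            return centroids, labels, quality
        # Assumed growth rule: seed the new cluster at the worst-fitted point.
        worst = max(range(len(points)),
                    key=lambda i: sq_dist(points[i], centroids[labels[i]]))
        centroids.append(list(points[worst]))
```

Calling `dynamic_kmeans(points, k_min, target)` with `k_max` equal to `k_min` corresponds to the static case, where the procedure reduces to K-means with firefly-chosen starting centroids. A larger `k_max` corresponds to the dynamic case, where `k_min` is the minimum number of required clusters. In practice `alpha` and `gamma` would need to be scaled to the range of the data.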
