An Effective Clustering Algorithm for Data Mining

This paper proposes an effective clustering algorithm for databases that serve as benchmark data sets in data mining applications. We present a Genetic Clustering Algorithm (GCA) that searches for a globally optimal partition of a given data set into a specified number of clusters. The algorithm is distance-based and represents each cluster by a centroid. To evaluate the proposed algorithm, we apply it to several artificial data sets and compare its results with those of K-means. Experimental results show that the proposed algorithm performs better than K-means and finds accurate clusters efficiently.
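The abstract does not specify the chromosome encoding, fitness function, or genetic operators of GCA, so the following is only a minimal sketch of one common way a distance-based, centroid-producing genetic clustering algorithm can be set up. Everything in it is an assumption made for illustration: each chromosome is taken to be a list of k centroids, fitness is the sum of squared Euclidean distances from each point to its nearest centroid, and the operators are tournament selection, single-point crossover, Gaussian mutation, and elitism. The function names and parameter defaults are hypothetical and are not taken from the paper.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def genetic_clustering(points, k, pop_size=30, generations=60,
                       mutation_rate=0.2, sigma=0.5, seed=0):
    """Illustrative GA: all operators and defaults here are assumptions."""
    rng = random.Random(seed)
    # Assumed encoding: a chromosome is a list of k centroids,
    # initialised by sampling data points.
    population = [[rng.choice(points) for _ in range(k)]
                  for _ in range(pop_size)]

    def tournament():
        # Binary tournament: the candidate with the lower SSE wins.
        a, b = rng.sample(population, 2)
        return a if sse(points, a) <= sse(points, b) else b

    best = min(population, key=lambda c: sse(points, c))
    for _ in range(generations):
        next_pop = [best]  # elitism: the best chromosome always survives
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            # Single-point crossover over the list of centroids.
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation: perturb a centroid with some probability.
            child = [tuple(x + rng.gauss(0, sigma) for x in c)
                     if rng.random() < mutation_rate else tuple(c)
                     for c in child]
            next_pop.append(child)
        population = next_pop
        best = min(population, key=lambda c: sse(points, c))
    return best, assign(points, best)

# Usage on a tiny artificial data set with two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, labels = genetic_clustering(points, k=2)
```

Because the best chromosome is carried over unchanged in every generation, the best fitness in this sketch never gets worse from one generation to the next. A comparison with K-means of the kind the abstract describes could run both methods on the same data and compare the resulting `sse` values.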
