Automatic Data Clustering by Genetic Algorithm Validated by Fuzzy Intercluster Hostility Index

Clustering a data set usually requires prior knowledge of its optimal number of clusters. Without information about the underlying data distribution, determining the optimal number of clusters in an unlabeled data set is not an easy task. Genetic algorithms (GAs) are randomized search and optimization techniques guided by the principles of evolution and natural genetics, and they are efficient enough to handle this type of problem. This article describes an application of a GA to the automatic clustering of large unlabeled multidimensional data sets. A fuzzy intercluster hostility index is proposed within this GA-based clustering algorithm and is used to determine the optimal number of clusters in unlabeled multidimensional data sets. Comparative studies with the Automatic Clustering Differential Evolution (ACDE) algorithm show superior results for the GA-based approach when the two algorithms are applied to two well-known real-life multidimensional data sets.
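The general idea of letting a GA choose the number of clusters can be illustrated with a minimal sketch. It is not the algorithm of this article: the abstract does not define the fuzzy intercluster hostility index, so the sketch substitutes a Davies-Bouldin-style score (computed around the chromosome's own centers, lower is better) as a stand-in validity index. The variable-length encoding, the operators, and every name and parameter (`evolve`, `k_max`, `pop_size`, `make_blobs`, and so on) are assumptions made only for illustration.

```python
import math
import random


def assign(points, centers):
    """Group points by nearest center; returns one list of points per center."""
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[j].append(p)
    return clusters


def validity(points, centers):
    """Stand-in index in the Davies-Bouldin style (lower is better).

    Assumption: this replaces the paper's fuzzy intercluster hostility
    index, whose definition is not given in the abstract.
    """
    live = [(c, m) for c, m in zip(centers, assign(points, centers)) if m]
    if len(live) < 2:
        return float("inf")  # fewer than two non-empty clusters is invalid
    scatter = [sum(math.dist(p, c) for p in m) / len(m) for c, m in live]
    total = 0.0
    for i, (ci, _) in enumerate(live):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(ci, cj)
            for j, (cj, _) in enumerate(live)
            if j != i
        )
    return total / len(live)


def evolve(points, k_max=6, pop_size=30, generations=30, seed=0):
    """Evolve variable-length lists of centers (2..k_max) with a simple GA."""
    rng = random.Random(seed)

    def fitness(ind):
        return validity(points, ind)

    def crossover(a, b):
        # Join a prefix of one parent with a suffix of the other.
        child = a[: rng.randint(1, len(a))] + b[rng.randint(0, len(b)):]
        return child[:k_max] if len(child) >= 2 else list(a)

    def mutate(ind):
        ind = list(ind)
        move = rng.random()
        if move < 0.3 and len(ind) < k_max:
            ind.append(rng.choice(points))       # add a center
        elif move < 0.6 and len(ind) > 2:
            ind.pop(rng.randrange(len(ind)))     # drop a center
        else:
            i = rng.randrange(len(ind))          # jitter one center
            ind[i] = tuple(x + rng.gauss(0, 0.1) for x in ind[i])
        return ind

    pop = [rng.sample(points, rng.randint(2, k_max)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 5]             # elitism: keep the best fifth
        children = []
        while len(elite) + len(children) < pop_size:
            # Tournament selection of two parents, three candidates each.
            a, b = (min(rng.sample(pop, 3), key=fitness) for _ in range(2))
            children.append(mutate(crossover(a, b)))
        pop = elite + children
    best = min(pop, key=fitness)
    return best, fitness(best)


def make_blobs(seed=1):
    """Synthetic 2-D test data: three Gaussian blobs (hypothetical example)."""
    rng = random.Random(seed)
    means = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    return [
        (mx + rng.gauss(0, 0.3), my + rng.gauss(0, 0.3))
        for mx, my in means
        for _ in range(20)
    ]


if __name__ == "__main__":
    data = make_blobs()
    centers, score = evolve(data)
    found = sum(1 for members in assign(data, centers) if members)
    print(f"non-empty clusters: {found}, index: {score:.3f}")
```

The number of clusters is read off as the count of non-empty clusters in the best chromosome. Swapping `validity` for another index changes only the fitness function, which is where a fuzzy hostility index would plug in.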
