A Point Symmetry-Based Automatic Clustering Approach Using Differential Evolution

Clustering is a core problem in data mining and machine learning, and it is widely applied in many fields. Evolutionary algorithms have recently become a popular way to solve it. This paper proposes an automatic clustering technique based on differential evolution (DE). The approach has three distinguishing features: (i) a modified point symmetry-based cluster validity index (CVI) that measures the validity of a candidate partitioning, (ii) a Kd-tree based nearest neighbor search that reduces the cost of finding the closest symmetric point, and (iii) a new representation for encoding an individual. Experiments on six artificial data sets of diverse complexity indicate that the approach is suitable for both intra-cluster symmetry and inter-cluster symmetry, and that it is able to find the optimal number of clusters in the data. Furthermore, compared with the original point symmetry-based CVI, the proposed CVI performs better in terms of both the F-measure and the number of clusters found.
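To make feature (ii) concrete, the following is a minimal sketch of how a Kd-tree can speed up the search for the closest symmetric point. It assumes the point symmetry distance commonly used in this line of work: reflect a point x through a candidate center c to obtain 2c - x, average the distances from that reflected point to its `knear` nearest data points, and multiply by the Euclidean distance between x and c. The abstract does not give the exact formula or the modified CVI, so the function names (`build_kdtree`, `k_nearest`, `point_symmetry_distance`) and the default `knear=2` are illustrative assumptions, not the paper's implementation.

```python
import heapq
import math

def build_kdtree(points, depth=0):
    """Recursively build a Kd-tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def k_nearest(tree, query, k):
    """Return the k smallest Euclidean distances from query to points in tree."""
    heap = []  # max-heap via negated distances; holds at most k entries

    def visit(node, depth):
        if node is None:
            return
        point, left, right = node
        dist = math.dist(point, query)
        if len(heap) < k:
            heapq.heappush(heap, -dist)
        elif dist < -heap[0]:
            heapq.heapreplace(heap, -dist)
        axis = depth % len(query)
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near, depth + 1)
        # Descend into the far side only if the splitting plane is closer
        # than the current k-th best distance (or fewer than k found so far).
        if len(heap) < k or abs(diff) < -heap[0]:
            visit(far, depth + 1)

    visit(tree, 0)
    return sorted(-d for d in heap)

def point_symmetry_distance(x, center, tree, knear=2):
    """Assumed form: mean distance from the reflected point to its knear
    nearest data points, scaled by the Euclidean distance from x to center."""
    reflected = tuple(2 * c - xi for xi, c in zip(x, center))
    d_sym = sum(k_nearest(tree, reflected, knear)) / knear
    return d_sym * math.dist(x, center)

# Hypothetical toy data, symmetric about the origin.
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
tree = build_kdtree(data)
d_ps = point_symmetry_distance((1.0, 0.0), (0.0, 0.0), tree)
```

In this sketch the tree is built once per data set and reused for every (point, center) pair that the evolutionary search evaluates, which is where the saving over a brute-force scan would come from.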
