A General Multiobjective Clustering Approach Based on Multiple Distance Measures

Many conventional clustering methods have difficulty partitioning data sets with different structures. The reason is that the relationship between each pair of data objects in different structures is usually described by different distance measures, whereas conventional clustering methods are often designed for an assumed distribution in Euclidean space. Many current clustering methods have been proposed to integrate different distance measures; however, the weights for the different distance measures are difficult to set. To alleviate this problem and to generate reliable clustering results for data sets with different structures, this paper proposes a novel multiple-distance-measure clustering method based on a multiobjective evolutionary algorithm. The approach takes two types of distance measures as multiple objective functions and optimizes them simultaneously using a modified multiobjective evolutionary algorithm with new strategies for initialization, the crossover operator, the mutation operator, and the design of the objective functions. Moreover, an updated approach is also proposed for detecting the correct number of clusters automatically. The new approaches are applied to data sets with spherical and irregular structures, and results on eight artificial, four widely used, and four real data sets are reported in the experiments. Comparisons with other clustering algorithms show that, regardless of the shape of the data set, both proposed approaches obtain satisfactory results in combining different distance measures and in detecting the optimal number of clusters in a single run.
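The idea of treating two distance measures as separate objectives, rather than combining them with hand-set weights, can be illustrated with a minimal sketch. The abstract does not name the two measures, so the choice below (Euclidean distance and a minimax path-based distance) is an assumption made for illustration, as are all function names, the medoid-style cluster representatives, and the toy data. This is not the authors' algorithm; it only shows how a candidate partition receives one score per distance measure and how two candidates are compared by Pareto dominance.

```python
import numpy as np

def euclidean_matrix(X):
    """Pairwise Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def minimax_path_matrix(D):
    """Assumed path-based distance: for each pair, the smallest achievable
    'largest hop' over all paths in the complete graph with edge weights D
    (a Floyd-Warshall-style update with max in place of +)."""
    P = D.copy()
    for k in range(len(P)):
        via_k = np.maximum(P[:, k][:, None], P[k, :][None, :])
        P = np.minimum(P, via_k)
    return P

def objectives(labels, representatives, D_euc, D_path):
    """Two objective values (both to be minimized) for one candidate partition.
    representatives[c] is the index of the point representing cluster c
    (a hypothetical medoid-style encoding)."""
    idx = np.arange(len(labels))
    rep = representatives[labels]
    return float(D_euc[idx, rep].sum()), float(D_path[idx, rep].sum())

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Toy data: two well-separated groups on a line.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
D_euc = euclidean_matrix(X)
D_path = minimax_path_matrix(D_euc)

# Candidate A splits the data at the gap; candidate B splits inside a group.
cand_a = objectives(np.array([0, 0, 0, 1, 1]), np.array([1, 3]), D_euc, D_path)
cand_b = objectives(np.array([0, 0, 1, 1, 1]), np.array([0, 3]), D_euc, D_path)
print(cand_a, cand_b, dominates(cand_a, cand_b))
```

In a multiobjective evolutionary algorithm, a dominance test of this kind is what ranks candidate partitions, so no weight between the two distance measures has to be chosen in advance.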
