Automatic Clustering Using Multi-objective Particle Swarm and Simulated Annealing

This paper proposes MOPSOSA, a new automatic clustering algorithm based on Multi-Objective Particle Swarm Optimization and Simulated Annealing. The algorithm clusters automatically: it partitions a dataset into a suitable number of clusters without that number being specified in advance. MOPSOSA combines features of multi-objective particle swarm optimization (PSO) and Multi-Objective Simulated Annealing (MOSA). Three cluster validity indices are optimized simultaneously to establish both the suitable number of clusters and the appropriate clustering for a dataset. The first index is based on Euclidean distance, the second on point symmetry distance, and the third on short distance. MOPSOSA is compared with a number of other algorithms on clustering problems, judged by how well each determines the actual number of clusters and the optimal clustering. Computational experiments were carried out on fourteen artificial and five real-life datasets.
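Optimizing three validity indices simultaneously usually means comparing candidate clusterings by Pareto dominance rather than by a single score. The following is a minimal sketch of that comparison, not the MOPSOSA implementation: it assumes each candidate has already been scored as a triple of index values and that all three indices are to be minimized. The names `dominates` and `pareto_front` are hypothetical and introduced only for illustration.

```python
from typing import List, Sequence, Tuple

# One candidate clustering, scored on three validity indices.
# Assumption: lower is better for every index.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is no worse than b on every index and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Sequence[Scores]) -> List[Scores]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical index values for four candidate clusterings.
candidates = [(1.0, 2.0, 3.0), (2.0, 3.0, 4.0), (3.0, 1.0, 2.0), (1.0, 2.0, 3.5)]
front = pareto_front(candidates)
```

Here the second and fourth candidates are dominated by the first, so only the first and third remain as trade-off solutions. A multi-objective search would typically keep such a non-dominated set and select a final clustering from it.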
