A hybridized approach to data clustering

Data clustering helps reveal the structure of large data sets and reduce their complexity. It is a common technique for statistical data analysis and is used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics, in which the data distribution can be of any size and shape. The well-known K-means algorithm has been applied successfully to many practical clustering problems, but it suffers from several drawbacks that stem from its choice of initial cluster centers. This research proposes a hybrid technique, called K-NM-PSO, that combines the K-means algorithm, Nelder-Mead simplex search, and particle swarm optimization (PSO). Like K-means, K-NM-PSO searches for the cluster centers of an arbitrary data set, but it can find the global optimum effectively and efficiently. K-NM-PSO is tested on nine data sets, and its performance is compared with that of PSO, NM-PSO, K-PSO, and K-means clustering. The results show that K-NM-PSO is both robust and suitable for data clustering.
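The idea of seeding a swarm with a K-means solution can be illustrated with a minimal sketch. This is not the K-NM-PSO procedure itself: the abstract does not give the objective function, swarm size, or update constants, so everything below is an assumption made for illustration. The objective is taken to be the sum of squared distances to the nearest center, the Nelder-Mead step is left out entirely, and the names `sse`, `kmeans`, and `pso_refine` as well as the parameters `w`, `c1`, and `c2` are hypothetical.

```python
import random

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center (assumed objective)."""
    return sum(min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
               for pt in points)

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd iteration; the result depends on the random initial centers."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for pt in points:
            nearest = min(range(k), key=lambda i: sse([pt], [centers[i]]))
            clusters[nearest].append(pt)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

def pso_refine(points, k, seed_centers, rng, n_particles=10, iters=50,
               w=0.7, c1=1.5, c2=1.5):
    """Basic PSO over flattened center coordinates; particle 0 starts at the K-means result."""
    d = len(points[0])
    flat = lambda cs: [x for c in cs for x in c]
    unflat = lambda v: [tuple(v[i * d:(i + 1) * d]) for i in range(k)]
    swarm = [flat(seed_centers)] + [flat(rng.sample(points, k))
                                    for _ in range(n_particles - 1)]
    vel = [[0.0] * (k * d) for _ in swarm]
    pbest = [list(s) for s in swarm]
    pbest_f = [sse(points, unflat(s)) for s in swarm]
    g = min(range(len(swarm)), key=pbest_f.__getitem__)
    gbest, gbest_f = list(pbest[g]), pbest_f[g]
    for _ in range(iters):
        for i, x in enumerate(swarm):
            for j in range(k * d):
                # Standard velocity update: inertia + pull to personal best + pull to global best.
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - x[j])
                             + c2 * rng.random() * (gbest[j] - x[j]))
                x[j] += vel[i][j]
            f = sse(points, unflat(x))
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = list(x), f
                if f < gbest_f:
                    gbest, gbest_f = list(x), f
    return unflat(gbest), gbest_f

# Synthetic data: three Gaussian blobs in the plane (illustrative only).
rng = random.Random(0)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
km_centers = kmeans(points, 3, rng)
km_f = sse(points, km_centers)
best_centers, best_f = pso_refine(points, 3, km_centers, rng)
```

Because one particle starts at the K-means solution and the global best is only replaced by a strictly better position, `best_f` can never exceed `km_f` in this sketch. Whether the swarm actually improves on K-means depends on the data and on the assumed parameters.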
