Combining K-Means and K-Harmonic with Fish School Search Algorithm for data clustering task on graphics processing units

Data clustering splits a set of objects into smaller groups whose members share common features. Several optimization techniques have been proposed to improve the performance of clustering algorithms. Swarm Intelligence (SI) algorithms address optimization problems and have been applied successfully in many domains. In this work, we propose a Swarm Clustering Algorithm (SCA) based on the standard K-Means and K-Harmonic Means (KHM) clustering algorithms, which serve as fitness functions for an SI algorithm: Fish School Search (FSS). The motivation is to exploit the search capability of SI algorithms and to avoid the major limitation of K-Means, namely its tendency to become trapped in local optima. SI algorithms are inherently parallel, since the fitness function can be evaluated for each individual independently. We therefore developed a parallel GPU implementation of the SCAs and compared its performance with that of the serial implementation. The aims of proposing the SCA are to verify the ability of the FSS algorithm to handle the clustering task and to study the performance difference between FSS-SCA implemented on CPU and on GPU. Experiments with 13 benchmark datasets show results of similar or slightly better quality than those of the standard K-Means algorithm and the Particle Swarm Optimization (PSO) algorithm. The results of using FSS for clustering are promising.
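As an illustration of how the two clustering objectives can act as fitness functions, the following minimal Python sketch scores candidate centroid sets, one per fish. It is not the authors' implementation: the function names, the KHM exponent `p`, the `eps` guard, and the toy data are assumptions made for this example, and the FSS movement operators are omitted. The K-Means fitness is the sum of squared distances from each point to its nearest centroid; the KHM fitness replaces the minimum with the harmonic mean of the distances raised to the power `p`. Lower values are better for both.

```python
import math

def kmeans_fitness(points, centroids):
    """K-Means objective: sum of squared distances to the nearest centroid."""
    return sum(min(math.dist(x, c) ** 2 for c in centroids) for x in points)

def khm_fitness(points, centroids, p=2.0, eps=1e-12):
    """K-Harmonic Means objective: for each point, k divided by the sum of
    inverse distances (raised to the power p) to all k centroids.
    p and eps are illustrative choices, not values taken from the paper."""
    k = len(centroids)
    total = 0.0
    for x in points:
        # eps guards against division by zero when a point sits on a centroid
        inv_sum = sum(1.0 / max(math.dist(x, c), eps) ** p for c in centroids)
        total += k / inv_sum
    return total

def evaluate_school(points, school, fitness):
    """Score every fish (candidate centroid set) independently.
    Each evaluation depends only on its own fish, which is why this step
    maps naturally onto one GPU thread or block per individual."""
    return [fitness(points, fish) for fish in school]

# Toy data (hypothetical): two well-separated groups in 2-D
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_fish = [(0.0, 0.5), (10.0, 10.5)]   # centroids at the group centers
bad_fish = [(5.0, 5.0), (5.0, 6.0)]      # centroids between the groups

kmeans_scores = evaluate_school(points, [good_fish, bad_fish], kmeans_fitness)
khm_scores = evaluate_school(points, [good_fish, bad_fish], khm_fitness)
print(kmeans_scores, khm_scores)
```

In this sketch the fish whose centroids sit at the group centers receives the lower score under both objectives. The loop in `evaluate_school` has no dependencies between iterations, which is the property the GPU implementation relies on.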
