SCA2: Novel Efficient Swarm Clustering Algorithm

Clustering is a classical unsupervised learning task that aims to reveal similarity patterns in data. Numerous algorithms have been proposed to address this task from different perspectives. In the field of swarm intelligence and evolutionary algorithms, most existing algorithms strive to identify a set of cluster centers. However, centroid-based algorithms have difficulty processing data whose clusters have arbitrary shapes. The Swarm Clustering Algorithm (SCA) was therefore proposed to cluster data from a different perspective: it regards each point in the dataset as a particle, and the particles fly towards denser areas to form clusters automatically. In this article, a novel efficient swarm clustering algorithm named SCA2 is proposed, which extends SCA in three respects: (1) a radial basis function network is adopted as the surrogate model to reduce the time complexity; (2) each particle has $k$ leaders and may follow any one of them, which reduces the risk of being misled; and (3) a simplified strategy is used to update the position of each particle. The performance of SCA2 on different types of synthetic and real-world datasets was compared with that of four classical algorithms, SCA, and a PSO-based clustering algorithm. The experimental results demonstrate that SCA2 is more competitive than the compared algorithms.
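To make the idea of particles flying towards denser areas concrete, the following is a minimal illustrative sketch in Python and not the SCA2 procedure itself. It rests on several assumptions that the abstract does not specify: density is an exact Gaussian kernel sum over the data (used here in place of the radial basis function surrogate), the $k$ leaders of a particle are its $k$ nearest denser particles within a hypothetical `radius`, and the position update is a fixed fractional step towards one randomly chosen leader. The names `kernel_density`, `fly`, `bandwidth`, `radius`, and `step` are all introduced for illustration only.

```python
import numpy as np


def kernel_density(queries, data, bandwidth):
    """Unnormalised Gaussian kernel density of `data`, evaluated at `queries`."""
    sq_dist = ((queries[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1)


def fly(data, k=3, bandwidth=0.5, radius=3.0, step=0.5, iterations=30, seed=0):
    """Move one particle per data point towards denser regions.

    Assumed rule (not taken from the paper): each particle picks one of its
    k nearest denser particles within `radius` at random and moves a
    fraction `step` of the way towards it.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    positions = data.copy()
    for _ in range(iterations):
        # Density of the fixed dataset, evaluated at the current positions.
        density = kernel_density(positions, data, bandwidth)
        sq_dist = ((positions[:, None, :] - positions[None, :, :]) ** 2).sum(axis=2)
        new_positions = positions.copy()
        for i in range(len(positions)):
            # Candidate leaders: strictly denser particles that are nearby.
            mask = (density > density[i]) & (sq_dist[i] <= radius ** 2)
            candidates = np.flatnonzero(mask)
            if candidates.size == 0:
                continue  # a local density peak stays where it is
            order = np.argsort(sq_dist[i, candidates])[:k]
            leader = rng.choice(candidates[order])
            # Simplified update: a convex step towards the chosen leader.
            new_positions[i] = positions[i] + step * (positions[leader] - positions[i])
        positions = new_positions
    return positions
```

Calling `fly(data)` on an array of shape `(n, d)` returns the final particle positions in the same shape. Because each update is a convex combination of two current positions, particles never leave the bounding box of the data, and particles from groups separated by more than `radius` never mix. Cluster labels could then be read off by grouping particles that end up close together, a step this sketch leaves out.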
