REDPC: A residual error-based density peak clustering algorithm

Abstract The density peak clustering (DPC) algorithm was designed to identify arbitrarily shaped clusters by finding density peaks in the underlying dataset. Owing to its relatively low computational complexity and small number of control parameters, DPC was soon widely adopted. However, DPC takes the entire data space into consideration when computing local density, which is then used to generate a decision graph for identifying cluster centroids. As a result, DPC may have difficulty differentiating overlapping clusters and dealing with low-density data points. In this paper, we propose a residual error-based density peak clustering algorithm, named REDPC, to better handle datasets comprising various data distribution patterns. Specifically, REDPC adopts residual error computation to measure the local density within a neighbourhood region. As such, compared with DPC, REDPC provides a better decision graph for identifying cluster centroids and better handles low-density data points. Experimental results on both synthetic and real-world datasets show that REDPC performs better than DPC and other algorithms.
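For orientation, the decision graph mentioned above can be illustrated with a minimal sketch of the baseline DPC quantities, not of REDPC itself, whose residual error computation is not specified in this abstract. The sketch assumes the commonly used cutoff-kernel form of DPC: the local density rho of a point is the number of other points within a cutoff distance, and delta is the distance to the nearest point of higher density (the maximum distance for the densest points). The function name `dpc_decision_graph` and the parameter `d_c` are illustrative choices, not names taken from the paper.

```python
import numpy as np


def dpc_decision_graph(X, d_c):
    """Return (rho, delta) for a baseline cutoff-kernel DPC decision graph.

    X   : (n, d) array of data points
    d_c : cutoff distance (hypothetical parameter name)
    """
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: number of other points closer than d_c
    # (subtract 1 to exclude the point itself).
    rho = (dist < d_c).sum(axis=1) - 1

    # Delta: distance to the nearest point with strictly higher density.
    # Points with no denser neighbour get their maximum distance instead.
    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta


# Two small, well-separated groups; indices 3 and 7 sit at the group centres.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.05, 0.05],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.05, 5.05]])
rho, delta = dpc_decision_graph(X, d_c=0.08)
# Candidate centroids are the points with both large rho and large delta.
centres = np.argsort(rho * delta)[-2:]
```

Points that combine a high rho with a high delta stand out in the decision graph and are taken as candidate cluster centroids. Because rho here depends on a single global cutoff, sparse clusters can be overshadowed by dense ones, which is the weakness the abstract attributes to DPC.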
