Mass-Based Density Peaks Clustering Algorithm

The density peaks clustering algorithm (DPC) finds cluster centers using the local density and relative distance of each data point. However, both attributes are computed from plain Euclidean distance, and DPC performs poorly when the density of the dataset is uneven or its dimensionality is high. In addition, the cutoff distance parameter \( d_{\text{c}} \) reflects only the global distribution of the dataset, so a small change in \( d_{\text{c}} \) strongly affects the clustering of small-scale datasets. To address these drawbacks, this paper proposes a mass-based density peaks clustering algorithm (MDPC). MDPC uses a mass-based similarity measure to compute a new similarity matrix. The K-nearest neighbour information of each data point is then obtained from this matrix, and the local density is redefined in terms of that K-nearest neighbour information. Experimental results show that MDPC outperforms DPC, performs satisfactorily on datasets with uneven density and higher dimensionality, and avoids the influence of \( d_{\text{c}} \) on small-scale datasets.
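To make the two DPC attributes concrete, the following is a minimal sketch of how local density and relative distance are commonly computed from Euclidean distances with a hard cutoff \( d_{\text{c}} \). The second function illustrates the general idea of a K-nearest-neighbour density that needs no \( d_{\text{c}} \). The function names `dpc_quantities` and `knn_density`, and the exponential form of the K-nearest-neighbour density, are assumptions made for illustration. They are not the definitions used by MDPC, and the mass-based similarity measure itself is not reproduced here.

```python
import numpy as np

def dpc_quantities(X, d_c):
    """Local density and relative distance in the style of the original DPC.

    Illustrative sketch only: uses the hard-cutoff density and plain
    Euclidean distance.
    """
    # Pairwise Euclidean distance matrix, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: number of other points closer than d_c.
    # The point itself is at distance 0, so subtract 1.
    rho = (dist < d_c).sum(axis=1) - 1

    # Relative distance: distance to the nearest point of strictly higher
    # density. Points with no denser neighbour (including ties for the
    # maximum density) get the largest distance in their row.
    n = X.shape[0]
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta, dist

def knn_density(dissim, k):
    """Hypothetical d_c-free density from a dissimilarity matrix.

    Assumed form: exp(-mean dissimilarity to the k nearest neighbours).
    Assumes each point's dissimilarity to itself is the row minimum.
    """
    nearest = np.sort(dissim, axis=1)[:, 1:k + 1]  # skip the self entry
    return np.exp(-nearest.mean(axis=1))

# Small example: a tight group of three points and a separate pair.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
rho, delta, dist = dpc_quantities(X, d_c=0.5)
rho_knn = knn_density(dist, k=2)
```

In this sketch `knn_density` accepts any dissimilarity matrix, so a data-dependent measure could be passed in place of `dist`. Candidate centers are the points for which both the density and `delta` are large.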
