McDPC: multi-center density peak clustering

Density peak clustering (DPC) is a recently developed density-based clustering algorithm that achieves competitive performance in a non-iterative manner. DPC effectively handles clusters with a single density peak (single center); that is, under DPC's hypothesis, one and only one data point is chosen as the center of each cluster. However, DPC may fail to identify clusters with multiple density peaks (multi-centers) and may fail to identify natural clusters whose centers have relatively low local density. To address these limitations, we propose a novel clustering algorithm based on a hierarchical approach, named multi-center density peak clustering (McDPC). First, based on the widely adopted hypothesis that potential cluster centers are relatively far away from each other, McDPC obtains the centers of the initial micro-clusters (named representative data points), whose minimum distance to any higher-density data point is relatively large. Second, the representative data points are autonomously categorized into different density levels. Finally, McDPC processes the micro-clusters at each level and, if necessary, merges the micro-clusters at a specific level into one cluster to identify multi-center clusters. To evaluate the effectiveness of the proposed McDPC algorithm, we conduct experiments on both synthetic and real-world datasets and benchmark McDPC against other state-of-the-art clustering algorithms. We also apply McDPC to image segmentation and facial recognition to further demonstrate its capability in real-world applications. The experimental results show that our method achieves promising performance.
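The first step described above relies on two per-point quantities familiar from DPC: a local density and the minimum distance to any point of higher density. The following is a minimal sketch of how these might be computed, not the authors' implementation. The Gaussian kernel, the cutoff distance `d_c`, the threshold `delta_min`, and the function names `density_and_delta` and `representative_points` are all assumptions introduced for illustration. The later steps (categorizing representative points into density levels and merging micro-clusters) are not covered.

```python
import numpy as np


def density_and_delta(points, d_c):
    """Return (rho, delta) for each point.

    rho:   local density, here a Gaussian kernel with cutoff distance d_c
           (an assumed choice; the abstract does not fix the kernel).
    delta: minimum distance to any point of higher density. The densest
           point has no such neighbour, so it receives its maximum
           distance to any other point instead.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Sum of kernel values over all other points (subtract the self term).
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0

    # Visit points from highest to lowest density; ties are broken by
    # index, so an earlier point in this order counts as "higher density".
    order = np.argsort(-rho, kind="stable")
    delta = np.empty(len(points))
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, len(order)):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()
    return rho, delta


def representative_points(delta, delta_min):
    """Indices of points whose delta exceeds a hypothetical threshold."""
    return np.flatnonzero(delta > delta_min)


# Two small, well-separated groups; indices 4 and 9 sit at their centers.
points = [
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0], [0.1, 0.1], [0.05, 0.05],
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0], [5.1, 5.1], [5.05, 5.05],
]
rho, delta = density_and_delta(points, d_c=0.2)
reps = representative_points(delta, delta_min=1.0)
print(reps)
```

In this toy example, the two central points have the highest density within their groups and a large distance to any denser point, so they are the ones selected. A fixed threshold is used here only for simplicity; how McDPC actually selects representative points is not specified in the abstract.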
