Density peak clustering based on relative density relationship

Abstract. The density peak clustering algorithm treats local density peaks as cluster centers and groups the remaining data points by assuming that each point belongs to the same cluster as its nearest higher-density neighbor. Although this algorithm has shown promise in some applications, its clustering results are sensitive to the choice of density kernel, and large density differences across clusters tend to produce wrong cluster centers. In this paper we attribute these problems to an inconsistency between the assumption and the implementation of the algorithm: the assumption rests entirely on relative density relationships, yet the algorithm uses absolute density as one criterion for identifying cluster centers. This observation prompts us to present a cluster center identification criterion based only on relative density relationships. Specifically, we define the concept of a subordinate to describe the relative density relationship, and use the number of subordinates as the criterion for identifying cluster centers. Because our approach uses only relative density relationships, it is less influenced by density kernels and by density differences across clusters. In addition, we discuss the problems of two existing density kernels and present an average-distance-based kernel. In data clustering experiments we validate the new criterion and the new density kernel separately, and then test the whole algorithm and compare it with several other clustering algorithms.
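
The ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration, since the abstract does not define them: the density is taken as the inverse of the mean distance to the `k` nearest neighbors (a stand-in for the average-distance-based kernel), a point's subordinates are taken to be the points whose nearest higher-density neighbor it is, and the globally densest point is always kept as one center so that every label can be propagated. The function names `relative_density_structure` and `assign_clusters` and the parameter `k` are hypothetical.

```python
import numpy as np


def relative_density_structure(X, k=5):
    """Return (density, parent, n_sub) for the rows of X (requires k < len(X))."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Assumed kernel: inverse of the mean distance to the k nearest neighbors.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    density = 1.0 / (knn.mean(axis=1) + 1e-12)
    # Rank points from densest to sparsest; a stable sort breaks ties by index.
    order = np.argsort(-density, kind="stable")
    parent = np.full(n, -1, dtype=int)
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points ranked as denser than i
        parent[i] = higher[np.argmin(dist[i, higher])]
    # Assumed definition: the subordinates of p are the points whose
    # nearest higher-density neighbor is p.
    n_sub = np.bincount(parent[parent >= 0], minlength=n)
    return density, parent, n_sub


def assign_clusters(X, n_clusters, k=5):
    """Pick centers by subordinate count, then propagate labels down the tree."""
    density, parent, n_sub = relative_density_structure(X, k)
    # The densest point has no parent; keep it as a center (assumption).
    root = int(np.flatnonzero(parent == -1)[0])
    ranked = [int(i) for i in np.argsort(-n_sub, kind="stable") if i != root]
    centers = [root] + ranked[:n_clusters - 1]
    labels = np.full(len(parent), -1, dtype=int)
    labels[centers] = np.arange(len(centers))
    # Visit points from densest to sparsest so each parent is labeled first.
    for i in np.argsort(-density, kind="stable"):
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Note that absolute density is used here only to rank the points. The center criterion itself depends on subordinate counts, which is the distinction the abstract draws from the original density peak algorithm.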
