A Centerness Peak Based Clustering Algorithm

The density peak based clustering algorithm is a recently proposed density based clustering approach. It treats the data points at local density peaks as cluster centers and assigns the remaining points according to the density relationship among neighboring points. Although simple, the algorithm has been shown to be effective and computationally efficient. However, selecting the points with the largest local density as cluster centers has two drawbacks. First, the real center of a cluster may not be its point of largest local density. Second, this practice may discriminate against clusters of small density. In this paper we propose to measure the centerness of each data point and to treat centerness peaks as cluster centers. The centerness of a point is evaluated from the distribution of its nearest neighbors within its neighborhood, and it measures the degree to which the point is surrounded by those neighbors. We present a histogram based method to calculate centerness and show that the centerness measure solves the problem caused by density differences among clusters in the density peak based algorithm. In addition, the cluster centers identified by centerness peaks are more consistent with human observation. Experiments on various datasets and comparisons with other algorithms illustrate the effectiveness of our algorithm.
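To make the idea of centerness concrete, the following is a minimal sketch of one possible histogram based measure in two dimensions. It is an illustration only and is not the formulation used in the paper: the function name `centerness_2d`, the parameters `k` and `n_bins`, and the choice of scoring a point by the fraction of angular bins occupied by its k nearest neighbors are all assumptions made here.

```python
import numpy as np

def centerness_2d(points, k=8, n_bins=8):
    """Hypothetical 2-D centerness score (an assumption, not the paper's definition).

    For each point, the directions to its k nearest neighbors are binned into
    an angular histogram; the score is the fraction of non-empty bins. A point
    surrounded on all sides scores close to 1, a boundary point scores lower.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    k = min(k, n - 1)  # a point cannot be its own neighbor

    # diffs[i, j] is the vector from point i to point j
    diffs = points[None, :, :] - points[:, None, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)  # exclude each point from its own neighbor list

    scores = np.empty(n)
    for i in range(n):
        nbrs = np.argsort(dists[i])[:k]
        # direction of each neighbor as an angle in [-pi, pi]
        angles = np.arctan2(diffs[i, nbrs, 1], diffs[i, nbrs, 0])
        hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
        scores[i] = np.count_nonzero(hist) / n_bins
    return scores
```

Under these assumptions the score depends only on the directions of the neighbors, not on how close they are, which is why such a measure would not favor dense clusters over sparse ones. Points with locally maximal scores could then play the role that density peaks play in the original algorithm.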
