Paper information

K-nearest neighbors optimized clustering algorithm by fast search and finding the density peaks of a dataset

To overcome the deficiencies of the clustering algorithm published in Science in June 2014, which performs a fast search to find density peaks, a new K-nearest neighbors-based clustering algorithm is proposed in this paper. The proposed algorithm defines the local density of a point based on its K-nearest neighbors, so that the density peaks can be found accurately. Two new assignment strategies for points are then developed based on K-nearest neighbors theory. The density peaks are regarded as the initial cluster centers, and the two assignment strategies are applied in turn to assign points to the density peaks in order to construct the clusters of a dataset. Through theoretical analysis of the principles of the new algorithm and thorough experiments on several popular test cases, including synthetic datasets and real-world datasets from the UCI machine learning repository and the Olivetti face database, we demonstrate that the proposed K-nearest neighbors-based clustering algorithm can detect the right cluster centers by searching for density peaks. Furthermore, we show that the proposed algorithm can recognize clusters regardless of their shape and of the dimensionality of the space in which they are embedded, and independently of the size of the dataset. The algorithm is also robust to outliers, and it outperforms the original algorithm published in Science as well as other popular clustering algorithms, including AP, DBSCAN, and K-means. The new algorithm is a powerful clustering method that can be used to discover the patterns and rules hidden in real-world datasets.
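The pipeline described in the abstract (K-nearest-neighbor density, density peaks as initial centers, assignment of points to peaks) can be illustrated with a minimal sketch. The abstract gives neither the density formula nor the two assignment strategies, so everything specific below is an assumption made for illustration: the density is taken as the exponential of the negative mean squared distance to the K nearest neighbors, centers are chosen as the points with the largest product of density and distance to the nearest denser point, and the assignment step falls back to the simple rule of the original 2014 algorithm (each point joins the cluster of its nearest denser point) rather than the paper's strategies. The function name `knn_density_peaks` and its parameters are hypothetical.

```python
import numpy as np

def knn_density_peaks(X, k=5, n_clusters=2):
    """Illustrative density-peak clustering with a K-nearest-neighbor density."""
    n = X.shape[0]
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # ASSUMED density: exp(-mean squared distance to the k nearest neighbors).
    # Column 0 of each sorted row is the point itself (distance 0), so skip it.
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]
    rho = np.exp(-np.mean(knn_dist ** 2, axis=1))

    # delta[i]: distance from i to its nearest point of higher density.
    # parent[i]: index of that point (used later for assignment).
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()   # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Density peaks: points with the largest rho * delta.
    gamma = rho * delta
    gamma[order[0]] = np.inf                 # the densest point is always a center
    centers = np.argsort(-gamma)[:n_clusters]

    # Simplified assignment (NOT the paper's two strategies): in order of
    # decreasing density, each point inherits the label of its parent.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers

# Usage on two well-separated synthetic blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(5.0, 0.3, (30, 2))])
labels, centers = knn_density_peaks(X, k=5, n_clusters=2)
```

Because the density depends only on the K nearest neighbors, it needs no global cutoff distance; the choice of `k` and the number of centers remain inputs in this sketch.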

Xie Juan-ying | Xie Weixin | Gao Hongchao