A Potential-Based Clustering Method by Fast Search and Find of Cluster Centers

Cluster analysis aims to group elements into categories on the basis of their similarities. Its applications range from astronomy to bioinformatics and pattern recognition. Clustering by fast search and find of density peaks (CDP) was proposed by Alex Rodriguez and Alessandro Laio in the journal Science. However, the density model used in this method does not describe the local density of each data point well. In addition, the algorithm requires a manually specified parameter, the cutoff distance, and is sensitive to its value. Although CDP finds density peaks quickly, the accuracy of its clustering results is not always satisfactory. To address these drawbacks, this paper proposes a potential-based clustering method by fast search and find of cluster centers (PFC). Using the potential field model, we build an edge-weighted tree that gives a better description of the information for each data point. Extensive experiments on three synthetic datasets and two real datasets show that the proposed algorithm outperforms the CDP algorithm and produces the most satisfactory results in most cases.
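
The sensitivity of CDP to the cutoff distance can be illustrated with a minimal sketch. It assumes the simple cutoff-kernel definitions from the density-peaks method: the local density rho_i counts the points closer than d_c, and delta_i is the distance to the nearest point of higher density. The function name `cdp_quantities`, the toy points, and the chosen d_c values are illustrative assumptions, not taken from the paper, and the sketch does not implement the proposed PFC method.

```python
import math

def cdp_quantities(points, d_c):
    """Return (rho, delta) per point using the cutoff-kernel density (assumed form)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: number of other points strictly closer than the cutoff distance d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # distances from point i to every point with strictly higher density
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # a point with no denser neighbour gets its maximum distance instead
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Hypothetical toy data: a group of five points and a group of three points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5), (5, 6), (6, 5)]

for d_c in (0.5, 1.1, 10.0):
    rho, delta = cdp_quantities(points, d_c)
    print(d_c, rho)
```

With d_c = 1.1 the densities differ between points, and the points (0.5, 0.5) and (5, 5) stand out with both a relatively high rho and a large delta. With d_c = 0.5 or d_c = 10.0 every point receives the same density, so rho carries no information. This is the kind of dependence on the cutoff distance that the abstract describes.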
