A density-adaptive affinity propagation clustering algorithm based on spectral dimension reduction

Abstract: Affinity propagation (AP) is a novel clustering method that identifies high-quality cluster centers by passing messages between data points. However, the number of clusters it finally produces depends on a user-defined parameter called self-confidence. When prior knowledge calls for a specific number of clusters, AP has to be run many times until a suitable self-confidence setting is found. The K-AP algorithm overcomes this disadvantage by introducing a constraint into the message-passing process so that it exploits the immediate results of K clusters. The key to K-AP clustering is constructing a similarity matrix that faithfully reflects the intrinsic structure of the dataset. In this paper, we design a density-adaptive similarity measure that describes the relations between data points more reasonably. To address the difficulties that K-AP faces on high-dimensional datasets, we use a dimension reduction method based on spectral graph theory to map the original data points into a low-dimensional eigenspace, and we propose a density-adaptive AP clustering algorithm based on spectral dimension reduction. Experiments show that the proposed algorithm effectively handles the clustering of datasets with complex structure and multiple scales while avoiding the singularity problem caused by high-dimensional eigenvectors. Its clustering performance is better than that of the AP and K-AP clustering algorithms.
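
The two preprocessing ideas named in the abstract, a density-adaptive similarity and a spectral mapping to a low-dimensional eigenspace, can be illustrated with a minimal sketch. The abstract does not define the similarity measure, so the code below substitutes a common stand-in, a locally scaled Gaussian kernel in which each point's bandwidth is its distance to its k-th nearest neighbour. The embedding uses the leading eigenvectors of the symmetrically normalized affinity matrix with row normalization. The function names, the parameter `k`, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def local_scaling_similarity(X, k=3):
    """Gaussian similarity with a per-point scale (assumed stand-in measure)."""
    # Pairwise squared Euclidean distances between all rows of X.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # sigma_i: distance from point i to its k-th nearest neighbour
    # (column 0 of the sorted row is the point itself, at distance 0).
    sigma = np.sqrt(np.sort(d2, axis=1)[:, k])
    W = np.exp(-d2 / (sigma[:, None] * sigma[None, :] + 1e-12))
    np.fill_diagonal(W, 0.0)  # no self-similarity in the graph
    return W


def spectral_embedding(W, n_components):
    """Map points to the leading eigenvectors of D^{-1/2} W D^{-1/2}."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg + 1e-12)
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so take the last columns.
    _, vecs = np.linalg.eigh(M)
    Y = vecs[:, -n_components:]
    # Normalize each row to unit length.
    return Y / (np.linalg.norm(Y, axis=1, keepdims=True) + 1e-12)


# Toy data (hypothetical): two well-separated groups in 10 dimensions.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.3, size=(10, 10))
group_b = rng.normal(5.0, 0.3, size=(10, 10))
X = np.vstack([group_a, group_b])

W = local_scaling_similarity(X, k=3)
Y = spectral_embedding(W, n_components=2)
print(Y.shape)
```

In a full pipeline, pairwise similarities computed on the rows of `Y` would then be handed to a message-passing clusterer. K-AP itself is not reproduced here; a standard AP implementation such as scikit-learn's `AffinityPropagation` could serve for experimentation, with the caveat that it does not enforce a fixed number of clusters.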
