Mk-NNG-DPC: density peaks clustering based on improved mutual K-nearest-neighbor graph

Clustering by fast search and find of density peaks (DPC, Density Peaks Clustering) is a relatively novel clustering algorithm published in Science. As a density-based clustering algorithm, DPC produces better clustering results while using fewer parameters than other relevant algorithms. However, we found that DPC does not perform well when clusters with different densities lie very close to each other. To address this problem, we propose a new DPC algorithm that incorporates an improved mutual k-nearest-neighbor graph (Mk-NNG) into DPC. Our Mk-NNG-DPC algorithm uses the distance matrix of the data samples to improve the Mk-NNG, and then uses DPC to constrain and select cluster centers. In this way, the proposed Mk-NNG-DPC algorithm allocates each instance to its best-fitting cluster. Experimental results on synthetic and real-world datasets show that Mk-NNG-DPC improves clustering performance effectively and efficiently, even for clusters with arbitrary shapes.
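To make the two building blocks named above concrete, the following is a minimal sketch of the standard DPC quantities (local density and distance to the nearest denser point) and of a plain mutual k-nearest-neighbor graph, both computed from one distance matrix. It is not the improved Mk-NNG proposed here, whose details the abstract does not give. The function names, the Gaussian density kernel, and the parameters `dc` and `k` are assumptions chosen for illustration.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X (shape n x d)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def dpc_rho_delta(D, dc):
    """Standard DPC quantities from a distance matrix D.

    rho[i]:   Gaussian-kernel local density with cutoff dc (assumed kernel),
              excluding the point's own contribution.
    delta[i]: distance to the nearest point of strictly higher density;
              for a point with no denser point, its largest distance.
    """
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0
    n = len(rho)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = D[i, higher].min() if higher.any() else D[i].max()
    return rho, delta


def mutual_knn_graph(D, k):
    """Boolean adjacency of the plain mutual k-NN graph.

    i and j are linked only if each is among the other's k nearest
    neighbors. This is the unmodified graph, not the improved variant.
    """
    n = D.shape[0]
    Dk = D.astype(float).copy()
    np.fill_diagonal(Dk, np.inf)           # a point is not its own neighbor
    knn = np.argsort(Dk, axis=1)[:, :k]    # indices of the k nearest points
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), knn.ravel()] = True
    return A & A.T                         # keep only mutual edges


# Two small, well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
D = pairwise_distances(X)
rho, delta = dpc_rho_delta(D, dc=2.0)
A = mutual_knn_graph(D, k=2)
```

In plain DPC, points with both large `rho` and large `delta` are candidate cluster centers. In this toy example the mutual k-NN graph has no edge between the two groups, which illustrates the kind of neighborhood constraint that a graph-based assignment step can exploit.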
