Two-phase clustering algorithm with density exploring distance measure

Here, the authors propose a novel two-phase clustering algorithm with a density exploring distance (DED) measure. In the first phase, the fast global K-means clustering algorithm is used to obtain the number of clusters and their prototypes. The prototypes of these clusters, together with representatives of the points belonging to them, then form the input data set of the second phase. In the second phase, all the prototypes are clustered according to the DED measure, which assigns high similarity to data points located in the same structure. In experimental studies, the authors test the proposed algorithm on seven artificial and seven UCI data sets. The results indicate that the proposed algorithm adapts to different data distributions and clusters data sets with complex non-convex distributions better than the algorithms it is compared against.
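The abstract does not give the formula for the DED measure, so the following is only a minimal sketch of the general idea that points lying in the same structure should be close to each other. It assumes a path-based, density-sensitive distance in which an edge of Euclidean length d costs rho**d - 1 and the distance between two points is the cheapest path through the other points. The function name `density_sensitive_distances`, the parameter `rho`, and the toy prototypes are illustrative assumptions, not the authors' definitions.

```python
import math
from itertools import product


def density_sensitive_distances(points, rho=2.0):
    """All-pairs path distances with edge cost rho**d - 1.

    This edge cost is an assumed stand-in for the DED measure, which the
    abstract does not specify.
    """
    if rho <= 1.0:
        raise ValueError("rho must be greater than 1")
    n = len(points)
    # Edge cost grows exponentially with Euclidean distance d, so one long
    # jump costs more than several short hops covering the same ground.
    dist = [[rho ** math.dist(p, q) - 1.0 for q in points] for p in points]
    # Floyd-Warshall shortest paths; product() varies k slowest, as required.
    for k, i, j in product(range(n), repeat=3):
        via_k = dist[i][k] + dist[k][j]
        if via_k < dist[i][j]:
            dist[i][j] = via_k
    return dist


# Hypothetical prototypes: five evenly spaced points forming one elongated
# structure, plus one isolated point at the same Euclidean distance from
# the first prototype as the far end of the chain.
prototypes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0),
              (0.0, 4.0)]
D = density_sensitive_distances(prototypes, rho=2.0)

print(D[0][4], D[0][5])  # prints "4.0 15.0"
```

Both end points are 4 units from the first prototype in Euclidean terms, but the one reachable through a chain of short hops ends up much closer under the path-based measure. This is the kind of behaviour that lets a second-phase clustering of prototypes follow non-convex structures.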
