Adaptive fuzzy clustering by fast search and find of density peaks

Clustering by fast search and find of density peaks (CFSFDP) clusters data by finding density peaks. It rests on two assumptions: a cluster center has a higher density than its surrounding neighbors, and it lies at a large distance from other cluster centers. Based on these assumptions, CFSFDP supports a heuristic approach, known as the decision graph, for manually selecting cluster centers. This manual selection is a major limitation of CFSFDP in intelligent data analysis. In this paper, we propose a fuzzy-CFSFDP method that selects cluster centers adaptively. It uses fuzzy rules, based on the aforementioned assumptions, to select cluster centers. We performed a number of experiments on nine synthetic clustering datasets and compared the resulting clusters with those of state-of-the-art methods. The clustering results and the comparisons on synthetic data validate the robustness and effectiveness of the proposed fuzzy-CFSFDP method.
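
The two assumptions can be illustrated with a minimal sketch of the quantities a decision graph plots: a local density for each point and its distance to the nearest point of higher density. The cutoff-kernel density, the cutoff distance `d_c`, the function name, and the toy data are assumptions made for illustration; the abstract does not specify them. Ranking points by the product of the two quantities is only a stand-in for manual selection and is not the fuzzy rule set of the proposed method.

```python
import numpy as np

def density_peak_quantities(points, d_c):
    """Return (rho, delta) for each point (illustrative helper, not from the paper).

    rho[i]   = number of other points closer than d_c (cutoff kernel, assumed).
    delta[i] = distance to the nearest point with strictly higher rho; a point
               with no higher-density point gets its maximum distance instead.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    rho = (dist < d_c).sum(axis=1) - 1         # subtract 1 to exclude the point itself
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta

# Toy data: two small groups, each with one point surrounded by neighbors.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0),
]
rho, delta = density_peak_quantities(points, d_c=0.12)

# Stand-in for reading the decision graph: take the two points with the
# largest rho * delta. The proposed method replaces this step with fuzzy rules.
gamma = rho * delta
centers = sorted(np.argsort(gamma)[-2:].tolist())
print(centers)
```

Points that are both dense and far from any denser point stand out with large values of both quantities, which is what a manual selection on the decision graph looks for.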
