Nonuniform Sparse Data Clustering Cascade Algorithm Based on Dynamic Cumulative Entropy

Limited prior knowledge and randomly chosen initial cluster centers directly affect the accuracy of iterative clustering algorithms. In this paper we propose a new algorithm that computes the initial cluster centers for k-means clustering and the best number of clusters with little prior knowledge, and then optimizes the clustering result. It constructs a Euclidean distance control factor based on the aggregation density sparse degree to select the initial cluster centers of nonuniform sparse data, and obtains initial data clusters by multidimensional diffusion density distribution. A multiobjective clustering approach based on dynamic cumulative entropy is then adopted to optimize the initial data clusters and the best number of clusters. The experimental results show that the proposed algorithm performs well in obtaining the initial cluster centers for the k-means algorithm and improves the clustering accuracy of nonuniform sparse data by about 5%.
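The general idea of choosing initial centers from local density combined with a Euclidean distance constraint can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: the function name `density_distance_seeds`, the `radius` parameter, and the scoring rule (neighbor count times distance to the nearest chosen seed) are assumptions made purely for illustration, and the entropy-based optimization stage is not covered.

```python
import numpy as np

def density_distance_seeds(X, k, radius):
    """Pick k seed points that are locally dense and far from each other.

    Hypothetical heuristic for illustration only; not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density: number of other points closer than `radius`
    # (subtract 1 to exclude the point itself).
    density = (dists < radius).sum(axis=1) - 1
    # First seed: the densest point.
    seeds = [int(np.argmax(density))]
    while len(seeds) < k:
        # Distance from every point to its nearest already chosen seed.
        nearest = dists[:, seeds].min(axis=1)
        # Favor dense points that lie far from existing seeds.
        # Chosen seeds score 0 because their nearest distance is 0.
        score = (density + 1) * nearest
        seeds.append(int(np.argmax(score)))
    return X[seeds]
```

The returned array could, for example, be passed as the initial centers of a standard k-means implementation in place of random initialization.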
