A new local density and relative distance based spectrum clustering

A novel local density and relative distance-based spectrum clustering (LDRDSC) algorithm is proposed for clustering multidimensional data. The density spectra take into account both redefined local densities and relative distances. Spectral peaks are taken as cluster centers because they correspond to local density maxima, and different clusters correspond to different spectra. The proposed LDRDSC algorithm is validated against the clustering by fast search and find of density peaks (CFSFDP) algorithm on several benchmark data sets. Once the density spectrum is generated, LDRDSC clusters the remaining points automatically, whereas CFSFDP must still assign each data point according to the cluster centers. LDRDSC is further compared with five other typical clustering algorithms (DBSCAN, FCM, AP, Mean Shift and k-means) to assess its effectiveness. Computational results demonstrate that the proposed algorithm obtains better clustering results than the algorithms mentioned above, especially in identifying noise and isolated points.
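To make the two quantities named in the abstract concrete, the following minimal sketch computes a local density and a relative distance for each point in the style of the original CFSFDP formulation (cutoff-based neighbor counts and distance to the nearest denser point). It does not reproduce the redefined local density or the density spectrum of LDRDSC, which the abstract does not specify. The function name `density_and_relative_distance`, the `cutoff` parameter, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def density_and_relative_distance(points, cutoff):
    """CFSFDP-style quantities (illustrative sketch, not the LDRDSC definitions).

    rho[i]   : number of other points closer than `cutoff` to point i
    delta[i] : distance from point i to the nearest point of higher density;
               points with no denser neighbor get their largest distance
               to any other point instead.
    """
    pts = np.asarray(points, dtype=float)
    # Pairwise Euclidean distance matrix via broadcasting.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Subtract 1 so a point does not count itself as a neighbor.
    rho = (dist < cutoff).sum(axis=1) - 1

    delta = np.empty(len(pts))
    for i in range(len(pts)):
        denser = rho > rho[i]
        if denser.any():
            delta[i] = dist[i, denser].min()
        else:
            # Simplification: every point tied for the top density is
            # treated as a peak candidate.
            delta[i] = dist[i].max()
    return rho, delta

# Hypothetical toy data: two well-separated groups on a line.
toy = [[x, 0.0] for x in (0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4)]
rho, delta = density_and_relative_distance(toy, cutoff=0.25)
# Points with both high density and high relative distance are peak candidates.
peak_candidates = np.argsort(rho * delta)[-2:]
```

In this sketch the product of density and relative distance serves as a simple peak score; points that are dense but close to an even denser point score low, which is why only the middle point of each group stands out.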
