Paper information - Data density based clustering

Data density based clustering

A new data density based approach to clustering is presented that automatically determines the number of clusters. Using Recursive Density Estimation (RDE) for each data sample significantly reduces the number of calculations in offline mode and also makes the method suitable for online use. Clusters may have a different diameter per feature (dimension), giving axis-orthogonal hyper-ellipsoid clusters. This yields greater differentiation between clusters when the clusters are highly asymmetrical. We illustrate the approach with three standard data sets, one artificial data set and one large real data set, and demonstrate results comparable to Subtractive, Hierarchical, K-Means, ELM and DBSCAN clustering. Unlike subtractive clustering, we do not iteratively calculate P. Unlike hierarchical clustering, we do not need O(N^2) distances to be calculated or a cut-off threshold to be defined. Unlike k-means, we do not need to predefine the number of clusters. Because the densities are calculated with the RDE equations, the algorithm is efficient and requires no iteration to approach the optimal result. We compare the proposed algorithm to these techniques with respect to several criteria. The results demonstrate the validity of the proposed approach.
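To make the two ideas in the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the commonly used Euclidean form of RDE, in which the density of a sample is 1 / (1 + mean squared distance to all samples seen so far), updated recursively from a running mean and a running mean of squared norms. The function names `rde_densities` and `in_axis_aligned_ellipsoid`, and the rule used to test whether a point falls inside an axis-orthogonal hyper-ellipsoid, are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def rde_densities(samples):
    """Density of each sample at its arrival time, computed recursively.

    Assumed form (Euclidean RDE):
        D_k = 1 / (1 + ||x_k - mu_k||^2 + X_k - ||mu_k||^2)
    where mu_k is the running mean of the samples and X_k is the running
    mean of their squared norms. No pairwise distances are stored.
    """
    samples = np.asarray(samples, dtype=float)
    mean = np.zeros(samples.shape[1])   # mu_k
    mean_sq_norm = 0.0                  # X_k
    densities = np.empty(len(samples))
    for k, x in enumerate(samples, start=1):
        # Recursive updates: each needs only the previous value and x.
        mean += (x - mean) / k
        mean_sq_norm += (x @ x - mean_sq_norm) / k
        spread = np.sum((x - mean) ** 2) + mean_sq_norm - mean @ mean
        densities[k - 1] = 1.0 / (1.0 + spread)
    return densities


def in_axis_aligned_ellipsoid(x, centre, radii):
    """Hypothetical membership test for an axis-orthogonal hyper-ellipsoid.

    Each dimension has its own radius, so a cluster can be wide along one
    feature and narrow along another.
    """
    x, centre, radii = (np.asarray(a, dtype=float) for a in (x, centre, radii))
    return bool(np.sum(((x - centre) / radii) ** 2) <= 1.0)
```

Because each update touches only the running statistics, one pass over N samples costs O(N) rather than the O(N^2) pairwise distances mentioned for hierarchical clustering. A cluster with radii `[2.0, 0.5]` would accept a point offset by 1.0 along the first feature but reject one offset by 1.0 along the second, which is the asymmetry the per-dimension diameters are meant to capture.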

Plamen Angelov | Richard Hyde

[1] Plamen Angelov. Fundamentals of Probability Theory, 2012.

[2] Stephen L. Chiu, et al. Fuzzy Model Identification Based on Cluster Estimation, 1994, J. Intell. Fuzzy Syst.

[3] Plamen P. Angelov, et al. Evolving local means method for clustering of streaming data, 2012, 2012 IEEE International Conference on Fuzzy Systems.

[4] R. Singer, et al. The Audubon Society field guide to North American mushrooms, 1981.

[5] Talel Abdessalem, et al. Using data stream management systems to analyze electric power consumption data, 2007, Monde des Util. Anal. Données.

[6] S. Chiu, et al. A cluster estimation method with extension to fuzzy model identification, 1994, Proceedings of 1994 IEEE 3rd International Fuzzy Systems Conference.

[7] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[8] Dimitar Filev, et al. Generation of Fuzzy Rules by Mountain Clustering, 1994, J. Intell. Fuzzy Syst.