A Hybrid Clustering Algorithm Based on Rough Set and Shared Nearest Neighbors

In this paper, a hybrid method combining rough set and shared nearest neighbor algorithms is proposed for clustering data with non-globular shapes. The rough k-means algorithm relies on the distances between data points and cluster centers. It partitions data sets with globular clusters well, but its results are unsatisfactory when the clusters are non-globular. To resolve this problem, the proposed algorithm first uses a shared nearest neighbor algorithm to evaluate the similarity among data points, and then uses the lower and upper approximations of rough set theory to partition the data set into clusters.
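The two stages described above can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes that shared nearest neighbor similarity is the count of common k-nearest neighbors (in the style of Jarvis and Patrick), and that a point belongs to the lower approximation of a cluster only when no other cluster is nearly as similar, a rule borrowed from rough k-means. The function names, the threshold `epsilon`, and the `(n, c)` similarity-to-cluster matrix are all hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X, excluding itself."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    return np.argsort(d, axis=1)[:, :k]


def snn_similarity(X, k):
    """Shared nearest neighbor similarity (assumed definition):
    S[i, j] = number of points that are in both k-NN lists of i and j."""
    n = X.shape[0]
    nn = knn_indices(X, k)
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], nn] = 1  # member[i, j] = 1 if j is in kNN(i)
    return member @ member.T


def rough_assign(sim_to_clusters, epsilon):
    """Rough assignment from an (n, c) similarity-to-cluster matrix.

    Assumed rule: a point joins the upper approximation of every cluster
    whose similarity is within epsilon of its best one; it joins the lower
    approximation only if exactly one cluster qualifies.
    """
    best = sim_to_clusters.max(axis=1, keepdims=True)
    upper = (best - sim_to_clusters) <= epsilon
    lower = upper & (upper.sum(axis=1, keepdims=True) == 1)
    return lower, upper


if __name__ == "__main__":
    # Two well-separated groups on a line (illustrative data only).
    X = np.array([[0.0], [1.0], [2.1], [3.3], [20.0], [21.0], [22.1], [23.3]])
    S = snn_similarity(X, k=2)
    # Hypothetical cluster similarity: mean SNN similarity to two seed pairs.
    sim = np.stack([S[:, [0, 1]].mean(axis=1), S[:, [4, 5]].mean(axis=1)], axis=1)
    lower, upper = rough_assign(sim, epsilon=0.25)
    print(lower.astype(int))
```

Because the similarity depends on neighbor lists rather than on distance to a center, two points can be similar even when they sit far apart along an elongated cluster, which is the property the abstract relies on for non-globular shapes.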
