In this paper we present a novel clustering method that represents the hierarchy of data granularity as a dendrogram. Instead of the (dis-)similarity of objects, we use their indiscernibility as the proximity measure. The indiscernibility of a pair of objects represents the level of global agreement on classifying the pair as indiscernible, and is calculated from the binary classifications determined independently for each object. Simple nearest-neighbor hierarchical clustering is then used to construct a dendrogram of objects, which represents the hierarchy of indiscernibility. This scheme allows us to control the granularity of the resulting object groups by interactively selecting the threshold level of indiscernibility. A further benefit is that the dissimilarity used to form the binary classifications need not satisfy symmetry or the triangle inequality; thus the method can be applied to various kinds of datasets, including relational data.
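The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: each object's binary classification is formed with a fixed dissimilarity cutoff (the hypothetical parameter `radius`), indiscernibility is taken as the fraction of objects whose classification places a pair on the same side, and the function names are illustrative. Cutting a nearest-neighbor (single-linkage) dendrogram at a given level yields the connected components of the graph that links pairs at or above that level, so the sketch computes the groups directly with a union-find structure.

```python
def binary_classifications(dissim, radius):
    """For each object i, the set of objects that i classifies as 'similar'.

    ASSUMPTION: a fixed cutoff `radius` on row i of the dissimilarity matrix;
    the abstract does not specify how the binary classifications are formed.
    """
    n = len(dissim)
    return [{j for j in range(n) if dissim[i][j] <= radius} for i in range(n)]


def indiscernibility(dissim, radius):
    """gamma[a][b] = fraction of objects whose classification puts a and b
    on the same side (both 'similar' or both 'dissimilar')."""
    n = len(dissim)
    classes = binary_classifications(dissim, radius)
    gamma = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            agree = sum((a in c) == (b in c) for c in classes)
            gamma[a][b] = agree / n
    return gamma


def groups_at_threshold(gamma, threshold):
    """Groups obtained by cutting the single-linkage dendrogram at `threshold`:
    connected components of the graph with an edge where gamma >= threshold."""
    n = len(gamma)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a in range(n):
        for b in range(a + 1, n):
            if gamma[a][b] >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for x in range(n):
        groups.setdefault(find(x), []).append(x)
    return sorted(groups.values())


# Illustrative, deliberately asymmetric dissimilarity matrix (made-up values).
D = [
    [0, 1, 5, 6],
    [2, 0, 5, 5],
    [6, 5, 0, 1],
    [5, 6, 2, 0],
]
gamma = indiscernibility(D, radius=2)
print(groups_at_threshold(gamma, 0.75))  # finer granularity
print(groups_at_threshold(gamma, 0.0))   # coarsest granularity
```

Although `D` is asymmetric, `gamma` is symmetric by construction because it only counts agreement between classifications. This is one way to see why symmetry of the underlying dissimilarity is not required. Raising or lowering `threshold` plays the role of the interactive granularity control.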