Clustering is a central unsupervised learning task with a wide variety of applications. Not surprisingly, there exist many clustering algorithms. However, unlike in classification tasks, different clustering algorithms may yield dramatically different outputs for the same input set. A major challenge is to develop tools that help select the most suitable algorithm for a given clustering task. We propose to address this problem by distilling abstract properties of clustering functions that distinguish between the types of input-output behavior of different clustering paradigms. In this paper we make a significant step in this direction by providing such a property-based characterization for the class of linkage-based clustering algorithms. Linkage-based clustering is one of the most commonly used and widely studied clustering paradigms. It includes popular algorithms such as Single Linkage and admits simple, efficient implementations. Beyond their potential merit in helping users decide when such algorithms are appropriate for their data, our results can be viewed as a convincing proof of concept for research on taxonomizing clustering paradigms by their abstract properties.
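To make the linkage-based paradigm concrete, the following is a minimal sketch of Single Linkage, not the paper's own formulation. It assumes the usual definition, in which the distance between two clusters is the smallest distance between any pair of their members and the two closest clusters are merged repeatedly. The function name `single_linkage`, the stopping rule (a target number of clusters `k`, assumed to be at least 1) and the toy data are illustrative choices.

```python
from itertools import combinations


def single_linkage(points, k, dist):
    """Merge the two closest clusters until only k clusters remain.

    Illustrative sketch: clusters hold indices into `points`, and the
    linkage between two clusters is the minimum pairwise distance.
    """
    clusters = [frozenset([i]) for i in range(len(points))]

    def linkage(a, b):
        # Single Linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        # Replace that pair with its union.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]

    return sorted(sorted(c) for c in clusters)


# Toy one-dimensional data with two well-separated groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
result = single_linkage(points, 2, lambda x, y: abs(x - y))
print(result)
```

Replacing `min` in `linkage` with `max` or with a mean would give complete or average linkage under the same merging loop, which is one way to see why these algorithms are grouped into a single paradigm.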