An Efficient Hierarchical Clustering Algorithm via Root Searching

As an important branch of machine learning, clustering is widely used for data analysis in various domains. Hierarchical clustering, one of the traditional families of clustering algorithms, has excellent stability but relatively poor time complexity. In this paper, we propose an efficient hierarchical clustering algorithm that iteratively searches for the nearest neighbors of given nodes. The method rests on one assumption: the representative node (root) of a cluster is likely to lie in the densest region of the data. Experimental results on 14 UCI datasets show that our algorithm achieves the best accuracy on most of them. Moreover, our method has linear time complexity, which is significantly better than that of traditional clustering methods such as UPGMA and K-Means.
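The abstract does not spell out the procedure, so the following is only a minimal sketch of one way the root-searching idea could be read, not the authors' algorithm. It assumes a density proxy (the inverse of the mean distance to the k nearest neighbors) and a linking rule (each node points to its nearest neighbor of higher density, and a node with no denser neighbor is a root). The function name `root_search_clusters` and the parameter `k` are hypothetical. The brute-force neighbor search here is quadratic and does not reproduce the linear-time complexity claimed above.

```python
import math


def root_search_clusters(points, k=3):
    """Label each point with the index of the root it reaches.

    Illustrative assumptions (not taken from the paper):
      * density is the inverse of the mean distance to the k nearest neighbors;
      * each point links to its nearest neighbor with higher density;
      * a point with no denser neighbor among its k nearest is a root.
    """
    n = len(points)

    # Brute-force k-nearest-neighbor search: O(n^2), for illustration only.
    neighbors = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append(dists[:k])

    # Density proxy: k divided by the summed distance to the k nearest neighbors.
    density = [k / (sum(d for d, _ in nb) + 1e-12) for nb in neighbors]

    # Link each point to its nearest denser neighbor. Ties in density are
    # broken by index, so the links can never form a cycle.
    parent = list(range(n))
    for i in range(n):
        for _, j in neighbors[i]:
            if (density[j], j) > (density[i], i):
                parent[i] = j
                break

    def find_root(i):
        # Follow links toward denser regions until a root is reached.
        while parent[i] != i:
            i = parent[i]
        return i

    return [find_root(i) for i in range(n)]


if __name__ == "__main__":
    # Two well-separated groups, each with one central point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    print(root_search_clusters(data, k=3))
```

In this reading, the number of clusters is not fixed in advance: it equals the number of roots found, which depends on the assumed `k`.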
