Comparison of Basic Clustering Algorithms
This paper presents a theoretical study of some common document clustering techniques. Clustering is a machine learning technique used in data mining; in simple terms, it groups similar data together for analysis. We compare the two main approaches to document clustering: hierarchical clustering and partitional clustering. We survey the algorithms and list their advantages and disadvantages. For hierarchical clustering, we discuss its two basic approaches, agglomerative and divisive. In partitional clustering, partitioning algorithms such as K-Means generate the partitions directly. K-Means differs substantially from the hierarchical algorithms, and neither approach is better in every case; which one is preferable depends on the situation. Partitional clustering is faster than hierarchical clustering but rests on stronger assumptions. In contrast, a hierarchical algorithm needs only a similarity measure and does not require additional input, such as the number of clusters, to be given.
Keywords: Document clustering, Clustering algorithms, K-means algorithm, Hierarchical algorithm, Partitional algorithm
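The contrast between the two approaches can be illustrated with a minimal sketch in Python. It is not the paper's implementation. It assumes toy 2-D points standing in for document vectors, Euclidean distance as the dissimilarity measure, single linkage for the agglomerative merges, and a deterministic K-Means initialization from the first k points; the names `kmeans`, `agglomerative`, and `docs` are hypothetical. Note that `kmeans` must be told the number of clusters, while `agglomerative` needs only the distance function and returns the full merge history.

```python
import math
from itertools import combinations


def kmeans(points, k, iters=100):
    """Partitional clustering: needs k up front (a stronger assumption)."""
    # Assumption: deterministic initialization from the first k points.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return clusters


def agglomerative(points):
    """Hierarchical (bottom-up) clustering: needs only a distance measure."""
    clusters = [[p] for p in points]  # start with one cluster per point
    merges = []                       # merge history, i.e. the hierarchy
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        merges.append((list(clusters[i]), list(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return merges


# Two well-separated groups of toy "documents".
docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

flat = kmeans(docs, k=2)    # the caller must choose k
tree = agglomerative(docs)  # no k required

print("K-Means clusters:", flat)
for left, right, d in tree:
    print(f"merge {left} + {right} at distance {d:.2f}")
```

In the merge history, a large jump in merge distance suggests where the hierarchy could be cut to obtain flat clusters. The pairwise search in `agglomerative` is also visibly more expensive than the assign-and-update loop in `kmeans`, which matches the abstract's remark that partitional clustering is faster.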