Paper Information - Parallel k/h-Means Clustering for Large Data Sets

This paper describes a parallel implementation of the k/h-means clustering algorithm, one of the basic algorithms used in a wide range of data mining tasks. We show how a database can be distributed and how the algorithm can be applied to the distributed database. Tests conducted on a network of 32 PCs showed nearly ideal speedup for large data sets.
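The abstract does not spell out the distribution scheme, so the following is only a minimal sketch of one common data-parallel formulation, not the authors' implementation. It assumes the data set is split into shards (one per machine), that each node computes per-cluster coordinate sums and counts for its own shard, and that a coordinator merges these partial results to update the centroids in a batch-style step. The function names `local_step`, `global_step`, and `parallel_kmeans` are hypothetical, and the nodes are simulated by a plain loop.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Partial = Tuple[List[List[float]], List[int]]  # (per-cluster sums, per-cluster counts)


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def local_step(shard: Sequence[Point], centroids: Sequence[Point]) -> Partial:
    """Work done on one node: per-cluster sums and counts for its shard only."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in shard:
        j = nearest(point, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts


def global_step(partials: Sequence[Partial], centroids: Sequence[Point]) -> List[Point]:
    """Coordinator: merge the partial results and recompute every centroid."""
    dim = len(centroids[0])
    updated: List[Point] = []
    for j, old in enumerate(centroids):
        total = sum(counts[j] for _, counts in partials)
        if total == 0:
            updated.append(old)  # assumption: an empty cluster keeps its centroid
            continue
        updated.append(
            tuple(sum(sums[j][d] for sums, _ in partials) / total for d in range(dim))
        )
    return updated


def parallel_kmeans(
    shards: Sequence[Sequence[Point]],
    centroids: Sequence[Point],
    max_iter: int = 100,
) -> List[Point]:
    """Iterate local and global steps until the centroids stop changing."""
    current = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # On a real cluster each call below would run on a different machine.
        partials = [local_step(shard, current) for shard in shards]
        updated = global_step(partials, current)
        if updated == current:
            break
        current = updated
    return current


shards = [
    [(0.0, 0.0), (10.0, 10.0)],  # data held by node 0
    [(0.0, 2.0), (10.0, 12.0)],  # data held by node 1
]
result = parallel_kmeans(shards, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```

Under these assumptions, each node sends only k sums and k counts per iteration, independent of the shard size. That is one plausible reason a scheme of this kind could scale well on large data sets.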

Kilian Stoffel | Abdelkader Belkoniene

[1] Edie M. Rasmussen, et al. Efficiency of Hierarchic Agglomerative Clustering using the ICL Distributed Array Processor, 1989, J. Documentation.

[2] Michael R. Anderberg, et al. Cluster Analysis for Applications, 1973.

[3] John A. Hartigan, et al. Clustering Algorithms, 1975.

[4] Jan M. Zytkow, et al. Knowledge Discovery in Database Terminology, 1996, Advances in Knowledge Discovery and Data Mining.

[5] M. F. Janowitz. Cluster Analysis Algorithms for Image Segmentation, 1981.

[6] Clark F. Olson, et al. Parallel Algorithms for Hierarchical Clustering, 1995, Parallel Comput.

[7] J. MacQueen. Some Methods for Classification and Analysis of Multivariate Observations, 1967.

[8] Vincent Kanade, et al. Clustering Algorithms, 2021, Wireless RF Energy Transfer in the Massive IoT Era.