A parallel hierarchical clustering algorithm based on PRAM model

An adaptive parallel algorithm for hierarchical clustering based on the PRAM model is presented. The algorithm proceeds in three stages: the data are preprocessed according to the “90-10” rule to reduce the size of the data set; a parallel algorithm builds a Euclidean minimum spanning tree on the complete graph of the remaining points; and a further algorithm finds the disjoining (splitting) strategies while keeping memory accesses conflict-free. The data set is thus clustered without memory conflicts, at lowest cost, and on the weakest PRAM variant, the EREW model. A data set of n points is clustered in O((λn)^2/p) time (0.1 ≤ λ ≤ 0.3) on p processors (1 ≤ p ≤ n/log n). The result is an adaptive, memory-conflict-free parallel hierarchical clustering algorithm. Computation time is greatly reduced once the original input data have been effectively preprocessed with the improved preprocessing methods of this thesis.
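The spanning-tree and disjoining stages can be illustrated with a minimal sequential sketch. It is not the paper's PRAM-EREW algorithm: it assumes Prim's algorithm on the complete Euclidean graph and assumes that "disjoining" means removing the k - 1 longest tree edges, which is the usual single-linkage construction. The function names `euclidean_mst` and `cluster_by_cutting` are hypothetical, and the "90-10" preprocessing step is omitted because the abstract does not define it.

```python
import math


def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph (sequential, O(n^2)).

    Returns the tree as a list of (length, u, v) edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # shortest known distance from the tree to each point
    best_parent = [-1] * n       # tree endpoint that realises best_dist
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_dist[u], best_parent[u], u))
        # Relax distances from the newly added point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


def cluster_by_cutting(points, k):
    """Label points with k clusters by dropping the k - 1 longest MST edges."""
    n = len(points)
    edges = sorted(euclidean_mst(points))
    kept = edges[: max(n - k, 0)]  # n - 1 tree edges minus the k - 1 longest

    parent = list(range(n))        # union-find over the kept edges

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, a, b in kept:
        parent[find(a)] = find(b)

    # Number the components in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(cluster_by_cutting(pts, 2))
```

The distance-relaxation loop is the part a PRAM formulation would divide among p processors. For a sense of scale in the stated bound, λ = 0.2, n = 10,000 and p = 100 give (λn)^2/p = 2000^2/100 = 40,000 units of work per processor, up to the constant hidden by the O-notation.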
