Parallel clustering algorithms for structured AMR

We compare several parallel implementations of the clustering operations performed during adaptive meshing in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos, which is widely used in SAMR applications. The baseline for comparison is a single-program, multiple-data (SPMD) extension of the original algorithm that works well on up to O(10^2) processors. Our goal is a clustering algorithm for machines with up to O(10^5) processors, such as the 64K-processor IBM BlueGene/L (BG/L) system. We first present an algorithm that avoids the unneeded communication of the baseline approach, improving clustering speed by up to an order of magnitude. We then present a new task-parallel implementation that further reduces communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior on our test problems. Performance is evaluated on several large-scale parallel computer systems, including a 16K-processor BG/L system.
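For orientation, the serial Berger and Rigoutsos procedure that the parallel variants build on can be sketched roughly as follows. This is a minimal 2D illustration only, not the implementation evaluated here: the function names, the 0.8 efficiency threshold, the tie-breaking rules, and the bisection fallback are all assumptions made for the example. In a distributed setting the per-axis sums in `signature` would presumably have to be combined across processors, which is the kind of communication the algorithms above aim to reduce.

```python
# Illustrative serial sketch of Berger-Rigoutsos style clustering in 2D.
# All names and the default threshold are hypothetical, chosen for this example.

def signature(tags, box, axis):
    """Count tagged cells in each slice of the box along the given axis."""
    lo, hi = box[axis]
    sig = [0] * (hi - lo + 1)
    for cell in tags:
        sig[cell[axis] - lo] += 1
    return sig

def find_hole(sig):
    """Return the interior index of an empty slice nearest the center, or None."""
    n = len(sig)
    zeros = [k for k in range(1, n - 1) if sig[k] == 0]
    return min(zeros, key=lambda k: abs(k - n / 2)) if zeros else None

def find_inflection(sig):
    """Return the cut index at the strongest sign change of the discrete Laplacian."""
    lap = [sig[k - 1] - 2 * sig[k] + sig[k + 1] for k in range(1, len(sig) - 1)]
    best, best_jump = None, 0
    for m in range(len(lap) - 1):
        if lap[m] * lap[m + 1] < 0:
            jump = abs(lap[m + 1] - lap[m])
            if jump > best_jump:
                # lap[m] belongs to slice m + 1, so the cut starts at slice m + 2.
                best, best_jump = m + 2, jump
    return best

def choose_cut(tags, box):
    """Pick (axis, coordinate): holes first, then inflections, then bisection."""
    axes = sorted(range(2), key=lambda a: box[a][0] - box[a][1])  # longest first
    for finder in (find_hole, find_inflection):
        for a in axes:
            k = finder(signature(tags, box, a))
            if k is not None:
                return a, box[a][0] + k
    a = axes[0]
    return a, box[a][0] + (box[a][1] - box[a][0] + 1) // 2

def cluster(tags, min_efficiency=0.8):
    """Cover a set of tagged (i, j) cells with boxes ((ilo, ihi), (jlo, jhi))."""
    if not tags:
        return []
    # Shrink to the bounding box of the tags.
    box = tuple((min(c[a] for c in tags), max(c[a] for c in tags)) for a in range(2))
    area = (box[0][1] - box[0][0] + 1) * (box[1][1] - box[1][0] + 1)
    if len(tags) / area >= min_efficiency:
        return [box]  # enough of the box is tagged: accept it
    axis, cut = choose_cut(tags, box)
    left = {c for c in tags if c[axis] < cut}
    right = {c for c in tags if c[axis] >= cut}
    return cluster(left, min_efficiency) + cluster(right, min_efficiency)
```

As a usage example, calling `cluster` on two well-separated 2x2 groups of tagged cells splits them at the empty slices between the groups and returns one box per group.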
