Effective out-of-core parallel Delaunay mesh refinement using off-the-shelf software

We present two cost-effective, high-performance out-of-core parallel mesh generation algorithms and their implementation on clusters of workstations (CoWs). The total wall-clock time, including wait-in-queue delays, for the out-of-core methods on a small cluster (16 processors) is three times shorter than the total wall-clock time for the in-core generation of a mesh of the same size (about a billion elements) using 121 processors. For mesh sizes that fit completely in the core of the CoWs, our best out-of-core method is about 5% slower than its in-core parallel counterpart. This is a modest performance penalty for savings of many hours in response time. Both the in-core and out-of-core methods use the best publicly available off-the-shelf sequential in-core Delaunay mesh generator.
