Dendro: parallel algorithms for multigrid and AMR methods on 2:1 balanced octrees

In this article, we present Dendro, a suite of parallel algorithms for the discretization and solution of partial differential equations (PDEs) involving second-order elliptic operators. Dendro uses trilinear finite element discretizations constructed using octrees. Dendro comprises four main modules: a bottom-up octree generation and 2:1 balancing module, a meshing module, a geometric multiplicative multigrid module, and a module for adaptive mesh refinement (AMR). Here, we focus on the multigrid and AMR modules. The key features of Dendro are coarsening/refinement, inter-octree transfers of scalar and vector fields, and parallel partitioning of multilevel octree forests. We describe a bottom-up algorithm for constructing the coarser multigrid levels. The input is an arbitrary 2:1 balanced octree-based mesh representing the fine-level mesh. The output is a set of octrees and meshes that are used in the multigrid sweeps. We also describe matrix-free implementations of the discretized PDE operators and the intergrid transfer operations. We present results on up to 4096 CPUs on the Cray XT3 ("BigBen"), the Intel 64 system ("Abe"), and the Sun Constellation Linux cluster ("Ranger").
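To make the bottom-up construction of coarser levels concrete, the following is a minimal serial sketch in Python. It is not Dendro's implementation: the octant encoding `(x, y, z, level)`, the constant `MAX_DEPTH`, and the functions `parent`, `children`, `coarsen`, and `multigrid_levels` are all assumptions introduced for illustration. The sketch only merges complete sets of eight sibling leaves into their parent and repeats until nothing changes; it omits the 2:1 re-balancing, meshing, and parallel partitioning steps that the abstract lists as part of the real algorithms.

```python
# Illustrative sketch only; names and data layout are hypothetical, not Dendro's API.
MAX_DEPTH = 4  # assumed maximum tree depth; anchors are integers in [0, 2**MAX_DEPTH)


def size(level):
    """Edge length of an octant at the given level, in finest-level units."""
    return 1 << (MAX_DEPTH - level)


def parent(octant):
    """Return the parent of an octant given as (x, y, z, level), level >= 1."""
    x, y, z, level = octant
    s = size(level - 1)
    return (x - x % s, y - y % s, z - z % s, level - 1)


def children(octant):
    """Return the eight children of an octant."""
    x, y, z, level = octant
    h = size(level + 1)
    return [(x + dx * h, y + dy * h, z + dz * h, level + 1)
            for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]


def coarsen(leaves):
    """One coarsening pass: replace every complete set of 8 sibling leaves
    by their parent and keep all other leaves unchanged."""
    leaf_set = set(leaves)
    coarse = set()
    for octant in leaves:
        if octant[3] == 0:  # the root cannot be coarsened further
            coarse.add(octant)
            continue
        p = parent(octant)
        if all(c in leaf_set for c in children(p)):
            coarse.add(p)  # all siblings are leaves, so merge them
        else:
            coarse.add(octant)
    return sorted(coarse)


def multigrid_levels(fine_leaves):
    """Build a fine-to-coarse list of leaf sets by repeated coarsening.
    NOTE: no 2:1 re-balancing is applied here, unlike the described method."""
    levels = [sorted(fine_leaves)]
    while True:
        nxt = coarsen(levels[-1])
        if nxt == levels[-1]:
            break
        levels.append(nxt)
    return levels


# Usage: a root split once, with its first child split again (15 leaves).
root = (0, 0, 0, 0)
first, *rest = children(root)
fine = children(first) + rest
hierarchy = multigrid_levels(fine)
print([len(level) for level in hierarchy])
```

In this toy input the first pass merges only the eight finest siblings, and the second pass merges the resulting eight level-1 octants into the root, so the hierarchy has three levels. A production version would also need to restore 2:1 balance after each pass and keep each level partitioned across processes.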
