Multilevel Balancing Domain Decomposition at Extreme Scales

In this paper we present a fully distributed, communicator-aware, recursive, and interlevel-overlapped message-passing implementation of the multilevel balancing domain decomposition by constraints (MLBDDC) preconditioner. The implementation relies heavily on subcommunicators to achieve coarse-grain overlap of computation with communication, and of communication with communication, among the levels of the hierarchy (namely, interlevel overlapping). Essentially, the main communicator is split into as many nonoverlapping subsets of message-passing interface (MPI) tasks (i.e., MPI subcommunicators) as there are levels in the hierarchy. Provided that specialized resources (cores and memory) are devoted to each level, a careful rescheduling and mapping of all the computations and communications in the algorithm allows a high degree of overlap to be exploited among levels. All subroutines and associated data structures are expressed recursively, and therefore MLBDDC preconditioners with an arbitrar...
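The communicator split described above can be illustrated with a minimal sketch. It does not reproduce the paper's implementation: it assumes that ranks are assigned to levels in contiguous blocks, and the names `level_of_rank`, `split_colors`, and `tasks_per_level` are hypothetical. The resulting level index is the kind of value one could pass as the `color` argument of `MPI_Comm_split`, so that each level ends up with its own nonoverlapping subcommunicator.

```python
from bisect import bisect_right
from itertools import accumulate


def level_of_rank(rank, tasks_per_level):
    """Return the 0-based level that owns `rank`.

    Assumption (not from the paper): ranks are laid out in contiguous
    blocks, level 0 first, and tasks_per_level[i] is the number of MPI
    tasks devoted to level i.
    """
    # Exclusive upper bound of the rank range of each level.
    bounds = list(accumulate(tasks_per_level))
    if not 0 <= rank < bounds[-1]:
        raise ValueError("rank outside the main communicator")
    return bisect_right(bounds, rank)


def split_colors(tasks_per_level):
    """Return the color (level index) of every rank in the main communicator."""
    total = sum(tasks_per_level)
    return [level_of_rank(r, tasks_per_level) for r in range(total)]


# Hypothetical three-level hierarchy: 8 fine-level tasks, 2 tasks on the
# intermediate level, 1 task on the coarsest level.
colors = split_colors([8, 2, 1])
print(colors)
```

Because every rank receives exactly one color, the subsets are disjoint and together cover the main communicator, which is the property the interlevel overlapping relies on.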
