Towards Textbook Efficiency for Parallel Multigrid

In this work, we extend Achi Brandt’s notion of textbook multigrid efficiency (TME) to massively parallel algorithms. Using a finite-element-based geometric multigrid implementation, we recall the classical view of TME with experiments for scalar linear equations with constant and varying coefficients, as well as for linear systems with saddle-point structure. To extend the idea of TME to the parallel setting, we give a new, architecture-aware characterization of a work unit (WU) that draws on performance modeling techniques. We illustrate the newly introduced parallel TME measure with large-scale computations, solving problems with up to 200 billion unknowns on a TOP-10 supercomputer.

AMS subject classifications: 65N55, 68W10
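As a rough illustration of the idea of an architecture-aware work unit, the following minimal sketch assumes that one WU is the time a machine needs to apply one elementary update to every unknown, estimated from a modeled per-core update rate. The function names, the parameters, and the numbers are hypothetical and are not taken from the paper; the classical rule of thumb that a TME solver costs fewer than about 10 WU is used only as a point of comparison.

```python
def work_unit_time(n_unknowns, n_cores, updates_per_sec_per_core):
    """Estimated time (seconds) of one work unit.

    Assumption: one WU updates each unknown once, and the machine
    sustains the given (modeled) update rate on every core.
    """
    return n_unknowns / (n_cores * updates_per_sec_per_core)


def parallel_tme_factor(solve_time, n_unknowns, n_cores, updates_per_sec_per_core):
    """Measured solve time expressed in multiples of one work unit."""
    wu = work_unit_time(n_unknowns, n_cores, updates_per_sec_per_core)
    return solve_time / wu


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    factor = parallel_tme_factor(
        solve_time=2.0,                 # seconds for the full solve
        n_unknowns=1e9,
        n_cores=1000,
        updates_per_sec_per_core=1e7,   # assumed modeled rate
    )
    print(f"solve cost: {factor:.1f} work units")
```

Under these assumptions, a smaller factor means the solver runs closer to the cost of a few idealized sweeps over the data on that particular machine, which is what makes the measure architecture-aware rather than a pure operation count.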
