A Geometric Multigrid Solver on GPU Clusters

GPU-based HPC clusters are increasingly being installed, so existing software design concepts need to be adapted to multi-GPU environments. We have developed a modular and easily extensible software framework called WaLBerla that covers a wide range of applications, from particulate flows and free-surface flows to nanofluids coupled with temperature simulations. In this article we report on our experience extending WaLBerla to support geometric multigrid algorithms for the numerical solution of partial differential equations (PDEs) on multi-GPU clusters. We discuss the object-oriented software and performance engineering concepts needed to integrate efficient compute kernels into the WaLBerla framework, and show that a large fraction of the high computational performance offered by current heterogeneous HPC clusters can be sustained for geometric multigrid algorithms.
