A Parallel Algebraic Multigrid Solver on Graphics Processing Units

This paper presents a multi-GPU implementation of the preconditioned conjugate gradient algorithm with an algebraic multigrid preconditioner (PCG-AMG) for an elliptic model problem on a 3D unstructured grid. It describes an efficient parallel sparse matrix-vector multiplication scheme, the kernel underlying the PCG-AMG algorithm, for the many-core GPU architecture. A performance comparison of the parallel solver shows that a single NVIDIA Tesla C1060 GPU board delivers the performance of a sixteen-node InfiniBand cluster, and that a multi-GPU configuration with eight GPUs is about 100 times faster than a typical server CPU core.
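To make the structure of the solver concrete, the following is a minimal serial sketch in Python of a preconditioned conjugate gradient loop built on a sparse matrix-vector product. It is not the paper's implementation. The CSR storage layout, the function names (`csr_spmv`, `pcg`, `make_jacobi`, `poisson_1d_csr`), the 1D Poisson test matrix, and the stopping tolerance are all assumptions made for illustration, and a simple Jacobi (diagonal) preconditioner stands in for the algebraic multigrid preconditioner, which is considerably more involved.

```python
def csr_spmv(indptr, indices, data, x):
    """Return y = A x for a matrix A stored in CSR form (assumed layout).

    Each row is computed independently of the others, which is the
    property a parallel GPU kernel would exploit.
    """
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(indptr, indices, data, b, apply_precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for a symmetric positive definite A.

    apply_precond(r) returns an approximation of A^{-1} r. In PCG-AMG this
    would be one multigrid cycle; here any callable can be plugged in.
    Returns the solution and the number of iterations performed.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    z = apply_precond(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        Ap = csr_spmv(indptr, indices, data, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def make_jacobi(indptr, indices, data):
    """Jacobi preconditioner: a stand-in for AMG, not the paper's method."""
    n = len(indptr) - 1
    diag = [1.0] * n
    for row in range(n):
        for k in range(indptr[row], indptr[row + 1]):
            if indices[k] == row:
                diag[row] = data[k]
    return lambda r: [ri / di for ri, di in zip(r, diag)]


def poisson_1d_csr(n):
    """Hypothetical test matrix: tridiagonal (-1, 2, -1) in CSR form."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data
```

Calling `pcg(*poisson_1d_csr(8), [1.0] * 8, apply_precond)` with `apply_precond = make_jacobi(*poisson_1d_csr(8))` solves a small model system. In this sketch the matrix-vector product is the dominant cost of each iteration, which is consistent with the abstract's emphasis on an efficient sparse matrix-vector multiplication scheme.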
