A novel and scalable Multigrid algorithm for many-core architectures

Multigrid algorithms are among the fastest iterative methods known today for solving large linear and some non-linear systems of equations. Although they have been heavily optimized for serial execution, their considerable potential for parallelism has not been fully realized. In this work, we present a novel multigrid algorithm designed to run entirely on many-core architectures such as graphics processing units (GPUs), with no memory transfers between the GPU and the central processing unit (CPU), thereby avoiding low-bandwidth communication. The algorithm is called high occupancy multigrid (HOMG) because it uses whole-grid operations in which interpolation and relaxation are fused into a single task, providing useful work for every thread in the grid. For a given accuracy, the number of operations scales linearly with the total number of nodes in the grid. Perfect scalability is observed for a large number of processors.
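For readers unfamiliar with the building blocks named above, the following is a minimal sketch of a textbook multigrid V-cycle for the 1D Poisson problem -u'' = f with zero boundary values. It is not the HOMG algorithm: it runs serially, keeps relaxation and interpolation as separate steps, and all function names (`relax`, `restrict`, `prolong`, `v_cycle`, `solve`) and the test problem are assumptions chosen for illustration. It is meant only to show why the work per cycle grows linearly with the number of nodes (each coarser grid has about half the nodes, so the total is bounded by roughly twice the fine-grid work) and that each node update within a relaxation sweep depends only on the previous iterate, which is the property a one-thread-per-node mapping relies on.

```python
import math


def relax(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps; every node is updated from the old iterate only."""
    n = len(u)
    for _ in range(sweeps):
        old = u[:]
        for i in range(n):
            left = old[i - 1] if i > 0 else 0.0
            right = old[i + 1] if i < n - 1 else 0.0
            u[i] = (1.0 - omega) * old[i] + omega * 0.5 * (left + right + h * h * f[i])
    return u


def residual(u, f, h):
    """r = f - A u for the standard 3-point discretization of -u''."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r


def restrict(r):
    """Full-weighting restriction from n = 2*nc + 1 fine nodes to nc coarse nodes."""
    nc = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2] for j in range(nc)]


def prolong(ec, n):
    """Linear interpolation of a coarse-grid correction to n fine nodes."""
    e = [0.0] * n
    for j, v in enumerate(ec):
        e[2 * j + 1] += v
        e[2 * j] += 0.5 * v
        e[2 * j + 2] += 0.5 * v
    return e


def v_cycle(u, f, h):
    """One recursive V-cycle; u is updated in place and returned."""
    n = len(u)
    if n == 1:
        u[0] = 0.5 * h * h * f[0]  # exact solve on the coarsest grid
        return u
    relax(u, f, h)                                # pre-smoothing
    rc = restrict(residual(u, f, h))              # coarse-grid right-hand side
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)    # coarse-grid correction
    e = prolong(ec, n)
    for i in range(n):
        u[i] += e[i]
    relax(u, f, h)                                # post-smoothing
    return u


def solve(n, cycles=10):
    """Solve -u'' = pi^2 sin(pi x); returns (residual reduction, max error)."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * n
    r0 = max(abs(v) for v in residual(u, f, h))
    for _ in range(cycles):
        v_cycle(u, f, h)
    r = max(abs(v) for v in residual(u, f, h))
    err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
    return r / r0, err


if __name__ == "__main__":
    for n in (63, 127, 255):  # n must be of the form 2**k - 1
        reduction, err = solve(n)
        print(n, reduction, err)
```

In this sketch the number of cycles needed for a given residual reduction should stay roughly constant as `n` grows, which, combined with the linear cost per cycle, gives the linear operation count. A GPU version would replace the inner `for i in range(n)` loops with one thread per node.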
