Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU

We present an implementation of a Two-Level Preconditioned Conjugate Gradient Method for the GPU. We investigate a preconditioner based on a Truncated Neumann Series in combination with deflation and compare it with Block Incomplete Cholesky schemes. This combination exhibits fine-grained parallelism, which yields a considerable reduction in execution time. Its numerical performance is also comparable to that of the Block Incomplete Cholesky approach. Our method provides a speedup of up to 16 times for a system of one million unknowns when compared to an optimized implementation on the CPU.
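To make the structure of such a method concrete, the following is a minimal CPU sketch in NumPy, not the GPU implementation described above. It rests on several assumptions that the abstract does not state: the preconditioner is taken to be the first-order truncation M^{-1} ≈ (I - D^{-1}L^T) D^{-1} (I - L D^{-1}), with D the diagonal and L the strictly lower triangular part of A; the deflation vectors are piecewise-constant subdomain indicators; and the test matrix is a 1-D Poisson operator. All function names are hypothetical. Applying this preconditioner needs only matrix-vector products and no triangular solves, which is one plausible reading of the fine-grained parallelism mentioned above.

```python
import numpy as np

def poisson_1d(n):
    """Dense 1-D Poisson matrix tridiag(-1, 2, -1); a stand-in test problem."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def neumann_preconditioner(A):
    """Return r -> (I - D^-1 L^T) D^-1 (I - L D^-1) r (assumed first-order truncation)."""
    d_inv = 1.0 / np.diag(A)
    L = np.tril(A, k=-1)                     # strictly lower triangular part of A
    K = np.eye(A.shape[0]) - L * d_inv       # I - L D^-1 (column j of L scaled by 1/d_j)
    return lambda r: K.T @ (d_inv * (K @ r))

def subdomain_deflation_vectors(n, k):
    """Piecewise-constant indicator vectors of k contiguous subdomains (assumed choice)."""
    Z = np.zeros((n, k))
    for j, idx in enumerate(np.array_split(np.arange(n), k)):
        Z[idx, j] = 1.0
    return Z

def deflated_pcg(A, b, Z, apply_minv, tol=1e-10, max_iter=500):
    """Deflated preconditioned CG for symmetric positive definite A, starting from x = 0."""
    AZ = A @ Z
    E = Z.T @ AZ                                   # small coarse matrix Z^T A Z
    coarse = lambda v: np.linalg.solve(E, v)
    P = lambda v: v - AZ @ coarse(Z.T @ v)         # P v   = v - A Z E^-1 Z^T v
    Pt = lambda v: v - Z @ coarse(AZ.T @ v)        # P^T v = v - Z E^-1 Z^T A v
    x = np.zeros_like(b)
    r = P(b)                                       # deflated residual for x = 0
    y = apply_minv(r)
    p = y.copy()
    rho = r @ y
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        w = P(A @ p)
        alpha = rho / (p @ w)
        x += alpha * p
        r -= alpha * w
        if np.linalg.norm(r) <= tol * b_norm:
            break
        y = apply_minv(r)
        rho_new = r @ y
        p = y + (rho_new / rho) * p
        rho = rho_new
    # Recombine the coarse-space part with the deflated iterate.
    return Z @ coarse(Z.T @ b) + Pt(x), it

if __name__ == "__main__":
    n = 64
    A = poisson_1d(n)
    b = np.ones(n)
    x, iters = deflated_pcg(A, b, subdomain_deflation_vectors(n, 4),
                            neumann_preconditioner(A))
    print("iterations:", iters, "residual norm:", np.linalg.norm(b - A @ x))
```

Dense matrices keep the sketch short. A GPU version would presumably store A in a sparse format and fuse the vector updates into kernels, but those details are not given in the abstract.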
