Exposing Fine-Grained Parallelism in Algebraic Multigrid Methods

Algebraic multigrid methods for large, sparse linear systems are a necessity in many computational simulations, yet parallel algorithms for such solvers are generally decomposed into coarse-grained...
