Supernodal sparse Cholesky factorization on graphics processing units

Sparse Cholesky factorization is the most computationally intensive component in solving large sparse linear systems and is the core algorithm of numerous scientific computing applications. A large number of sparse Cholesky factorization algorithms have previously emerged, exploiting architectural features for various computing platforms. The recent use of graphics processing units (GPUs) to accelerate structured parallel applications shows the potential to achieve significant acceleration relative to desktop performance. However, sparse Cholesky factorization has not been explored sufficiently because of the complexity involved in its efficient implementation and the concerns of low GPU utilization.

[1]  Joseph W. H. Liu,et al.  The Multifrontal Method for Sparse Matrix Solution: Theory and Practice , 1992, SIAM Rev..

[2]  Martin Berggren,et al.  Hybrid differentiation strategies for simulation and analysis of applications in C++ , 2008, TOMS.

[3]  Anamitra R. Choudhury,et al.  Multifrontal Factorization of Sparse SPD Matrices on GPUs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[4]  Joseph W. H. Liu,et al.  On Finding Supernodes for Sparse Matrix Computations , 1993, SIAM J. Matrix Anal. Appl..

[5]  Sergio Pissanetzky,et al.  Sparse Matrix Technology , 1984 .

[6]  Jack J. Dongarra,et al.  A set of level 3 basic linear algebra subprograms , 1990, TOMS.

[7]  Roger Grimes,et al.  Multifrontal Computations on GPUs and Their Multi-core Hosts , 2010, VECPAR.

[8]  Ümit V. Çatalyürek,et al.  An Early Evaluation of the Scalability of Graph Algorithms on the Intel MIC Architecture , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.

[9]  Jack Dongarra,et al.  Sparse direct solvers with accelerators over DAG runtimes , 2012 .

[10]  Jennifer A. Scott,et al.  Design of a Multicore Sparse Cholesky Factorization Using DAGs , 2010, SIAM J. Sci. Comput..

[11]  Timothy A. Davis,et al.  The university of Florida sparse matrix collection , 2011, TOMS.

[12]  YANQING CHEN,et al.  Algorithm 8 xx : CHOLMOD , supernodal sparse Cholesky factorization and update / downdate ∗ , 2006 .

[13]  James Demmel,et al.  A Supernodal Approach to Sparse Partial Pivoting , 1999, SIAM J. Matrix Anal. Appl..

[14]  John K. Reid,et al.  The Multifrontal Solution of Indefinite Sparse Symmetric Linear , 1983, TOMS.

[15]  Charles L. Lawson,et al.  Basic Linear Algebra Subprograms for Fortran Usage , 1979, TOMS.

[16]  Jianfeng Wang,et al.  GPU Solutions to Multi-scale Problems in Science and Engineering , 2011 .

[17]  Joseph W. H. Liu The role of elimination trees in sparse factorization , 1990 .

[18]  Murat Efe Guney,et al.  On the limits of GPU acceleration , 2010 .

[19]  Jack Dongarra,et al.  Scientific Computing with Multicore and Accelerators , 2010, Chapman and Hall / CRC computational science series.

[20]  Fred G. Gustavson,et al.  Two Fast Algorithms for Sparse Matrices: Multiplication and Permuted Transposition , 1978, TOMS.

[21]  Timothy A. Davis,et al.  Direct methods for sparse linear systems , 2006, Fundamentals of algorithms.