Characterizing dataset dependence for sparse matrix-vector multiplication on GPUs

Sparse matrix-vector multiplication (SpMV) is a widely used kernel in scientific applications as well as data analytics. Many GPU implementations of SpMV have been proposed, proposing different sparse matrix representations. However, no sparse matrix representation is consistently superior, and the best representation varies for sparse matrices with different sparsity patterns. In this paper we study four popular sparse representations implemented in the NVIDIA cuSPARSE library: CSR, ELL, COO and a hybrid ELL-COO scheme. We analyze statistical features of a dataset of 27 matrices, covering a wide spectrum of sparsity features, and attempt to correlate SpMV performance with each representation with simple aggregate metrics of the matrices. We present some insights on the correlation between matrix features and the best choice for sparse matrix representation.

[1]  Richard Vuduc,et al.  Automatic performance tuning of sparse matrix kernels , 2003 .

[2]  Timothy A. Davis,et al.  The university of Florida sparse matrix collection , 2011, TOMS.

[3]  Yao Zhang,et al.  Scan primitives for GPU computing , 2007, GH '07.

[4]  Srinivasan Parthasarathy,et al.  Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.

[5]  P. Sadayappan,et al.  An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs , 2014, ICS '14.

[6]  Shengen Yan,et al.  yaSpMV: yet another SpMV framework on GPUs , 2014, PPoPP.

[7]  Michael Garland,et al.  Efficient Sparse Matrix-Vector Multiplication on CUDA , 2008 .

[8]  Srinivasan Parthasarathy,et al.  Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining , 2011, Proc. VLDB Endow..

[9]  Xing Liu,et al.  Efficient sparse matrix-vector multiplication on x86-based many-core processors , 2013, ICS '13.

[10]  Rajesh Bordawekar,et al.  Optimizing Sparse Matrix-Vector Multiplication on GPUs , 2009 .

[11]  Samuel Williams,et al.  Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).

[12]  Eric S. Chung,et al.  SpMV: A Memory-Bound Application on the GPU Stuck Between a Rock and a Hard Place , 2012 .

[13]  I. Reguly,et al.  Efficient sparse matrix-vector multiplication on cache-based GPUs , 2012, 2012 Innovative Parallel Computing (InPar).

[14]  Richard W. Vuduc,et al.  Model-driven autotuning of sparse matrix-vector multiply on GPUs , 2010, PPoPP '10.

[15]  P. Sadayappan,et al.  High-performance sparse matrix-vector multiplication on GPUs for structured grid computations , 2012, GPGPU-5.

[16]  Michael Garland,et al.  Implementing sparse matrix-vector multiplication on throughput-oriented processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.