Paper Information - Optimizing Sparse Matrix Vector Multiplication on SMP

Optimizing Sparse Matrix Vector Multiplication on SMP

We describe optimizations of sparse matrix-vector multiplication on uniprocessors and SMPs. The optimization techniques include register blocking, cache blocking, and matrix reordering. We focus on optimizations that improve performance on SMPs, in particular, matrix reordering implemented using two different graph algorithms. We present a performance study of this algorithmic kernel, showing how the optimization techniques affect absolute performance and scalability, how they interact with one another, and how the performance benefits depend on matrix structure.
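For readers unfamiliar with the kernel, the following is a minimal illustrative sketch of sparse matrix-vector multiplication in compressed sparse row (CSR) form, next to a variant with 2x2 register blocking. It is not the authors' implementation. The function names, the fixed 2x2 block size, and the storage layout (row-major tiles, with explicit zeros filling partially empty tiles, and an even matrix dimension) are assumptions chosen for brevity.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for A stored in CSR form (hypothetical helper)."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def spmv_bcsr_2x2(brow_ptr, bcol_idx, blocks, x):
    """Same product with the matrix stored as dense 2x2 tiles.

    Assumption: each tile is 4 consecutive values in row-major order, and
    the matrix dimension is even. Two x entries and two accumulators are
    reused across four multiply-adds, which is the idea behind register
    blocking.
    """
    n_brows = len(brow_ptr) - 1
    y = [0.0] * (2 * n_brows)
    for bi in range(n_brows):
        y0 = y1 = 0.0
        for k in range(brow_ptr[bi], brow_ptr[bi + 1]):
            j = 2 * bcol_idx[k]          # first column covered by this tile
            x0, x1 = x[j], x[j + 1]
            a00, a01, a10, a11 = blocks[4 * k: 4 * k + 4]
            y0 += a00 * x0 + a01 * x1
            y1 += a10 * x0 + a11 * x1
        y[2 * bi] = y0
        y[2 * bi + 1] = y1
    return y


# Example: the 4x4 matrix [[1,2,0,0],[3,4,0,0],[0,0,5,0],[0,0,0,6]].
x = [1.0, 2.0, 3.0, 4.0]
y_csr = spmv_csr([0, 2, 4, 5, 6], [0, 1, 0, 1, 2, 3], [1, 2, 3, 4, 5, 6], x)
y_blk = spmv_bcsr_2x2([0, 1, 2], [0, 1], [1, 2, 3, 4, 5, 0, 0, 6], x)
```

Both calls compute the same product. The blocked layout stores one column index per tile instead of one per nonzero, at the cost of storing the two explicit zeros in the second tile, which is one reason the benefit of blocking depends on matrix structure.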

E. Im | K. Yelick

[1] A. H. Sherman, et al. Comparative Analysis of the Cuthill–McKee and the Reverse Cuthill–McKee Ordering Algorithms for Sparse Matrices, 1976.

[2] Jack J. Dongarra, et al. An extended set of FORTRAN basic linear algebra subprograms, 1988, TOMS.

[3] Bruce Hendrickson, et al. A Multi-Level Algorithm For Partitioning Graphs, 1995, Proceedings of the IEEE/ACM SC95 Conference.

[4] John N. Shadid, et al. Aztec user's guide. Version 1, 1995.

[5] Mark T. Jones, et al. BlockSolve95 users manual: Scalable library software for the parallel solution of sparse linear systems, 1995.