Communication-optimal iterative methods

Data movement, both within the memory system of a single processor node and between multiple nodes in a system, limits the performance of many Krylov subspace methods for solving sparse linear systems and eigenvalue problems. In algorithms such as CG, GMRES, Lanczos, and Arnoldi, s iterations perform s sparse matrix-vector multiplications and Θ(s) vector reductions, so both single-node and network communication grow as Θ(s). By reorganizing the sparse matrix kernel to compute a set of matrix-vector products at once, and reorganizing the rest of the algorithm accordingly, we can perform s iterations by sending O(log P) messages instead of Θ(s · log P) messages on a parallel machine, and by reading the on-node components of the matrix A from DRAM to cache just once on a single node instead of s times. This reduces communication to the minimum possible. We discuss both the algorithms and an implementation of GMRES on a single node of an 8-core Intel Clovertown. Our implementations achieve significant speedups over the conventional algorithms.
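The kernel at the heart of this reorganization computes the Krylov basis [x, Ax, A²x, ..., Aˢx] in a single invocation rather than as s separate sparse matrix-vector products. The sketch below shows only the mathematical specification of that kernel, not the communication-avoiding implementation (which blocks A and replicates "ghost" rows so each block of s products needs just one read of its part of A); the function name and test matrix are illustrative, not from the paper.

```python
import numpy as np
import scipy.sparse as sp

def matrix_powers(A, x, s):
    """Return V with rows [x, A @ x, A @ A @ x, ..., A^s @ x].

    A conventional Krylov iteration would perform these s products one
    at a time, reading A from slow memory on each iteration; the
    communication-avoiding kernel produces the same vectors in one pass
    over A. This reference version keeps the naive loop for clarity.
    """
    V = np.empty((s + 1, x.shape[0]))
    V[0] = x
    for k in range(s):
        V[k + 1] = A @ V[k]  # one SpMV per basis vector
    return V

# Illustrative example: a 1-D Poisson (tridiagonal) matrix, a common
# test problem, with a starting vector of all ones.
A = sp.diags([1.0, -2.0, 1.0], offsets=[-1, 0, 1], shape=(6, 6), format="csr")
x = np.ones(6)
V = matrix_powers(A, x, 4)
```

The rest of the iterative method (orthogonalization, reductions) is then restructured to consume these s basis vectors per outer step, which is what collapses Θ(s · log P) messages into O(log P).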