论文信息 - Performance Tuning of Matrix Triple Products Based on Matrix Structure

Performance Tuning of Matrix Triple Products Based on Matrix Structure

Sparse matrix computations arise in many scientific and engineering applications, but their performance is limited by the growing gap between processor and memory speed. In this paper, we present a case study of an important sparse matrix triple product problem that commonly arises in primal-dual optimization method. Instead of a generic two-phase algorithm, we devise and implement a single pass algorithm that exploits the block diagonal structure of the matrix. Our algorithm uses fewer floating point operations and roughly half the memory of the two-phase algorithm. The speed-up of the one-phase scheme over the two-phase scheme is 2.04 on a 900 MHz Intel Itanium-2, 1.63 on an 1 GHz Power-4, and 1.99 on a 900 MHz Sun Ultra-3.

Ismail Bustany | James Demmel | Katherine A. Yelick | Cleve Ashcraft | Eun-Jin Im

[1] John R. Gilbert,et al. Sparse Matrices in MATLAB: Design and Implementation , 1992, SIAM J. Matrix Anal. Appl..

[2] James Demmel,et al. Memory Hierarchy Optimizations and Performance ounds for Sparse A , 2003, International Conference on Computational Science.

[3] Katherine A. Yelick,et al. Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY , 2001, International Conference on Computational Science.

[4] Richard W. Vuduc,et al. Sparsity: Optimization Framework for Sparse Matrix Kernels , 2004, Int. J. High Perform. Comput. Appl..

[5] David P. Williamson,et al. Primal-Dual Approximation Algorithms for Integral Flow and Multicut in Trees, with Applications to Matching and Set Cover , 1993, ICALP.