Comparative analysis of approaches to hardware acceleration for sparse-matrix factorization
The authors compare two standard approaches to sparse LU (lower-upper) factorization, the compiled-code approach and the scatter-gather approach, with respect to three criteria relevant to multiprocessor hardware acceleration: idealized parallelism, memory access cost, and storage requirements. The compiled-code approach is shown to be the clear winner on the first criterion, while the scatter-gather approach has much lower memory access cost and storage requirements. As a compromise, the authors propose a data structure in which the rows of the sparse matrix are stored in an overlapped fashion, combined with the representation of each row-level operation as a single task. The idealized parallelism of this approach lies between that of the other two approaches; its memory access cost is the same as that of the scatter-gather approach, and its storage requirement is only moderately worse.
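To make the idea of storing rows "in an overlapped fashion" concrete, the following is a minimal sketch of a first-fit row-displacement packing, in which each sparse row is shifted by an offset so that its nonzeros fall into gaps left by earlier rows of one shared array. It is an illustration only and is not taken from the paper: the function names `pack_rows` and `lookup`, the `owner` tag array, and the first-fit placement rule are all assumptions, and the authors' actual data structure may differ.

```python
def pack_rows(rows):
    """Pack sparse rows into one shared array (hypothetical first-fit scheme).

    rows: list of dicts mapping column index -> nonzero value.
    Returns (offsets, owner, values), where entry (r, c) lives at
    index offsets[r] + c if and only if owner[offsets[r] + c] == r.
    """
    owner = []    # owner[i] = row that occupies slot i, or -1 if free
    values = []   # values[i] = stored nonzero (0.0 if the slot is free)
    offsets = []  # offsets[r] = displacement chosen for row r
    for r, row in enumerate(rows):
        d = 0
        # First fit: advance the displacement until no slot collides.
        while any(d + c < len(owner) and owner[d + c] != -1 for c in row):
            d += 1
        # Grow the shared arrays if the row extends past the current end.
        needed = d + max(row, default=-1) + 1
        if needed > len(owner):
            grow = needed - len(owner)
            owner.extend([-1] * grow)
            values.extend([0.0] * grow)
        for c, v in row.items():
            owner[d + c] = r
            values[d + c] = v
        offsets.append(d)
    return offsets, owner, values


def lookup(offsets, owner, values, r, c):
    """Return entry (r, c), or 0.0 if it is not a stored nonzero."""
    i = offsets[r] + c
    if i < len(owner) and owner[i] == r:
        return values[i]
    return 0.0


# A small 3 x 4 example with six nonzeros.
rows = [{0: 4.0, 2: 1.0}, {1: 3.0, 3: 2.0}, {0: 5.0, 1: 6.0}]
offsets, owner, values = pack_rows(rows)
```

In this example rows 0 and 1 interleave within the same four slots, so the shared array needs six entries rather than the twelve of a dense 3 x 4 layout. An access to entry (r, c) costs one offset addition and one ownership check, which is one plausible reading of why an overlapped layout can keep memory access cost low at a moderate storage overhead.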