Paper information - Generating efficient parallel code for successive over-relaxation

Generating efficient parallel code for successive over-relaxation

A complete suite of algorithms for parallelizing compilers to generate efficient SPMD code for SOR problems is presented. By applying a unimodular transformation before loop tiling and parallelization, the number of messages per iteration per processor is reduced from 3^n - 1 in the conventional parallel SOR algorithm to 2^n - 1, where n is the dimensionality of the data set. To maintain memory scalability, a novel approach is proposed that uses the local dynamic memory of the parallel processors to implement the skewed data set.
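The two message counts in the abstract match a simple combinatorial picture. The sketch below illustrates that picture only; it is an assumption about how the counts arise, not the paper's code-generation algorithm. It assumes that in a conventional block decomposition each processor exchanges data with every adjacent block (all nonzero offsets in {-1, 0, 1}^n), and that after skewing each block needs data only from blocks at offsets in {-1, 0}^n. The names `neighbor_offsets` and `message_counts` are hypothetical.

```python
from itertools import product


def neighbor_offsets(n, steps):
    """Return every nonzero offset vector of length n with entries drawn from steps."""
    return [v for v in product(steps, repeat=n) if any(v)]


def message_counts(n):
    """Count neighbor blocks for an n-dimensional data set under the two assumed patterns."""
    # Assumed conventional pattern: all adjacent blocks, including diagonal ones.
    conventional = len(neighbor_offsets(n, (-1, 0, 1)))  # 3^n - 1
    # Assumed skewed pattern: only blocks at non-positive offsets.
    skewed = len(neighbor_offsets(n, (-1, 0)))  # 2^n - 1
    return conventional, skewed


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, message_counts(n))
```

For a three-dimensional data set this gives 26 neighbors in the conventional pattern against 7 in the skewed one, which agrees with the 3^n - 1 and 2^n - 1 figures stated above.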

Peiyi Tang

[1] P. Tang. Generating Efficient Parallel Code for Successive Over-relaxation , 2007 .

[2] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.

[3] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..

[4] Corinne Ancourt,et al. Scanning polyhedra with DO loops , 1991, PPOPP '91.

[5] William Pugh,et al. A practical algorithm for exact array dependence analysis , 1992, CACM.

[6] Peiyi Tang,et al. Reducing data communication overhead for DOACROSS loop nests , 1994, ICS '94.

[7] G. C. Fox,et al. Solving Problems on Concurrent Processors , 1988 .