An efficient parallel algorithm for the 3-D FFT NAS parallel benchmark
We propose an efficient algorithm for implementing the NAS 3-D FFT parallel benchmark. The algorithm overlaps communication with computation. On parallel machines that support such overlap, it can outperform the non-overlapping version of the same algorithm by a factor close to two.
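The "factor close to two" bound can be illustrated with a simple cost model. The sketch below is not the paper's algorithm: it assumes the work is split into equal chunks, that each chunk costs a fixed compute time and a fixed communication time, and that the transfer of one chunk can proceed while the next chunk is being computed. The function names and parameters (`chunks`, `t_comp`, `t_comm`) are hypothetical and chosen only for illustration.

```python
def non_overlapped_time(chunks, t_comp, t_comm):
    """Total time when each chunk is computed and then sent, strictly in sequence."""
    return chunks * (t_comp + t_comm)


def overlapped_time(chunks, t_comp, t_comm):
    """Total time for a two-stage pipeline (assumed model, not the paper's).

    Chunk i is sent while chunk i + 1 is computed. The first computation
    and the last transfer cannot be hidden; each of the chunks - 1 steps
    in between costs as much as the slower of the two stages.
    """
    return t_comp + (chunks - 1) * max(t_comp, t_comm) + t_comm


def speedup(chunks, t_comp, t_comm):
    """Ratio of non-overlapped time to overlapped time under this model."""
    return non_overlapped_time(chunks, t_comp, t_comm) / overlapped_time(
        chunks, t_comp, t_comm
    )


if __name__ == "__main__":
    # Balanced compute and communication: the ratio approaches 2 as chunks grow.
    for chunks in (1, 4, 16, 64):
        print(chunks, speedup(chunks, t_comp=1.0, t_comm=1.0))
```

Under this model the speedup is largest when compute and communication times are balanced, and it stays strictly below two because the first computation and the last transfer are never hidden. When one stage dominates, the gain shrinks toward one.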