Integer sorting on shared-memory vector parallel computers
This paper describes new fast integer sorting methods for single vector computers and shared-memory parallel vector computers, based on the bucket sort algorithm. Existing vectorization methods for bucket sort go to great lengths to avoid store conflicts in vector scatter operations, and are therefore not very efficient. The vectorization methods presented in this paper (the retry method, the split vector method, and the mask vector method) all actively exploit the nature of store conflicts to achieve high performance. The parallelization method in this paper uses a feature of shared-memory machines and dynamically changes the partitioning of histogram arrays without any overhead. By combining the retry and parallelization methods, we obtained the world's fastest results for the IS program (Class B) in the NAS Parallel Benchmarks on the NEC SX-4. Our methods are also applicable to a wide range of particle simulation programs.
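The abstract does not spell out how the retry method works, so the following is only a minimal sketch of one way to exploit store conflicts when building the bucket sort histogram. It assumes that when several lanes of a vector scatter target the same address, exactly one store survives. Each lane scatters its own element index into a work array, reads it back, and updates the histogram only if its index survived; the remaining lanes retry. The names `scatter`, `histogram_retry`, `bucket_sort`, and the `vlen` parameter are hypothetical, and the vector operations are emulated with scalar Python loops.

```python
def scatter(dest, indices, values):
    """Emulate a vector scatter. With duplicate indices, one store survives
    (here the last one, which is an assumption about the hardware)."""
    for i, v in zip(indices, values):
        dest[i] = v


def histogram_retry(keys, num_buckets, vlen=8):
    """Count key occurrences, processing vlen keys per emulated vector step."""
    hist = [0] * num_buckets
    work = [-1] * num_buckets  # holds the element index that won each bucket
    for start in range(0, len(keys), vlen):
        pending = list(range(start, min(start + vlen, len(keys))))
        while pending:
            # Every pending lane scatters its own element index to work[key].
            scatter(work, [keys[p] for p in pending], pending)
            # Gather back: a lane wins if its index is the one that survived.
            winners = [p for p in pending if work[keys[p]] == p]
            losers = [p for p in pending if work[keys[p]] != p]
            # Winners have pairwise distinct keys, so this update has no conflicts.
            for p in winners:
                hist[keys[p]] += 1
            # At least one lane per distinct key wins, so the loop terminates.
            pending = losers
    return hist


def bucket_sort(keys, num_buckets):
    """Sort integer keys in the range [0, num_buckets) via the histogram."""
    hist = histogram_retry(keys, num_buckets)
    out = []
    for key, count in enumerate(hist):
        out.extend([key] * count)
    return out
```

For example, `bucket_sort([3, 1, 3, 3, 0, 1], 4)` returns `[0, 1, 1, 3, 3, 3]`. On real vector hardware the inner steps would be a scatter, a gather, and a masked compare; the split vector and mask vector methods named in the abstract are not covered by this sketch.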
[1] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.
[2] David H. Bailey, et al. NAS parallel benchmark results, 1992, Proceedings Supercomputing '92.
[3] Kenichi Miura, et al. A Vector-Parallel Implementation and Statistical Analysis of the Bucket Sort on a Vector-Parallel Distributed Memory System: Lessons Learned in the Integer Sort NAS Parallel Benchmark, 1995, PPSC.