Communication overhead on the CM5: an experimental performance evaluation
The authors present experimental results on communication overhead for the CM-5, a scalable parallel machine. The measured communication latency of the data network is 88 μs. Messages whose length is a multiple of 16 bytes cost much less to communicate than messages whose length is not, so users should pad messages to a multiple of 16 bytes for better performance. The authors also study the communication overhead of three complete exchange algorithms. For small message sizes, the recursive exchange algorithm performs best, especially on large multiprocessors; for large message sizes, the pairwise exchange algorithm is preferable. Finally, the authors study two algorithms for one-to-all broadcast, the linear broadcast algorithm and the recursive broadcast algorithm: linear broadcast performs poorly, while recursive broadcast performs well.
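The padding advice and the broadcast comparison can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, the recursive broadcast is assumed to be the textbook recursive-doubling scheme on a power-of-two node count, and the pairwise exchange is assumed to use the common XOR partner schedule. The code only builds communication schedules and counts steps; it does not measure anything on a CM-5.

```python
def padded_size(nbytes: int, block: int = 16) -> int:
    """Round a message length up to the next multiple of `block` bytes."""
    return -(-nbytes // block) * block


def pad_message(payload: bytes, block: int = 16) -> bytes:
    """Append zero bytes so the payload length is a multiple of `block`."""
    return payload + b"\x00" * (padded_size(len(payload), block) - len(payload))


def linear_broadcast(p: int, root: int = 0):
    """Root sends to each other node in turn: p - 1 sequential steps."""
    return [[(root, dst)] for dst in range(p) if dst != root]


def recursive_broadcast(p: int):
    """Assumed recursive doubling from node 0: every node that already
    holds the message forwards it, so the holder count doubles per step.
    Requires p to be a power of two; takes log2(p) steps."""
    assert p > 0 and p & (p - 1) == 0, "p must be a power of two"
    steps, have = [], 1
    while have < p:
        steps.append([(src, src + have) for src in range(have)])
        have *= 2
    return steps


def pairwise_exchange(p: int):
    """Assumed XOR schedule for complete exchange: in step j (1..p-1),
    node i exchanges with node i XOR j. Requires p to be a power of two."""
    assert p > 0 and p & (p - 1) == 0, "p must be a power of two"
    return [[(i, i ^ j) for i in range(p)] for j in range(1, p)]


if __name__ == "__main__":
    print(padded_size(20))               # a 20-byte message is padded to 32
    print(len(linear_broadcast(32)))     # sequential sends from the root
    print(len(recursive_broadcast(32)))  # doubling steps
```

Under these assumptions, a 32-node linear broadcast needs 31 sequential sends from the root, while recursive doubling needs 5 steps, which is consistent with the abstract's observation that linear broadcast performs poorly.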