Performance Enhancement for Matrix Multiplication on an SMP PC Cluster
[1] Tsutomu Yoshinaga, et al. Construction of Hybrid MPI-OpenMP Solutions for SMP Clusters, 2005.
[2] Franck Cappello, et al. Intra node parallelization of MPI programs with OpenMP, 1998.
[3] Franck Cappello, et al. Investigating the performance of two programming models for clusters of SMP PCs, 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[4] J. Choi, et al. A fast scalable universal matrix multiplication algorithm on distributed-memory concurrent computers, 1997, Proceedings 11th International Parallel Processing Symposium.
[5] Rolf Rabenseifner, et al. Hybrid Parallel Programming: Performance Problems and Chances, 2003.
[6] Jaeyoung Choi, et al. PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers, 1994, Concurr. Pract. Exp.
[7] Franck Cappello, et al. MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[8] Mitsuhisa Sato, et al. Implementation and performance evaluation of SPAM particle code with MPI-OpenMP hybrid programming, 2001.
[9] Robert A. van de Geijn, et al. SUMMA: scalable universal matrix multiplication algorithm, 1995, Concurr. Pract. Exp.
[10] Tsutomu Yoshinaga, et al. A Hybrid MPI-OpenMP Solution for a Linear System on a Cluster of SMPs, 2003.
[11] Gerhard Wellein, et al. Fast Sparse Matrix-Vector Multiplication for TeraFlop/s Computers, 2002, VECPAR.
[12] Taisuke Boku, et al. Implementation and performance evaluation of SPAM particle code with OpenMP-MPI hybrid programming, 2007.
[13] Gerhard Wellein, et al. Communication and Optimization Aspects of Parallel Programming Models on Hybrid Architectures, 2003, Int. J. High Perform. Comput. Appl.