A study of process arrival patterns for MPI collective operations
[1] Yves Robert, et al. Pipelining Broadcasts on Heterogeneous Platforms, 2005, IEEE Trans. Parallel Distributed Syst.
[2] Rami G. Melhem, et al. Algorithms for Supporting Compiled Communication, 2003, IEEE Trans. Parallel Distributed Syst.
[3] Xin Yuan, et al. Automatic generation and tuning of MPI collective communication routines, 2005, ICS '05.
[4] Xin Yuan, et al. STAR-MPI: self tuned adaptive routines for MPI collective operations, 2006, ICS '06.
[5] Ahmad Faraj, et al. Communication Characteristics in the NAS Parallel Benchmarks, 2002, IASTED PDCS.
[6] Xin Yuan, et al. An MPI prototype for compiled communication on Ethernet switched clusters, 2005, J. Parallel Distributed Comput.
[7] Basel A. Mahafzah, et al. Statistical analysis of message passing programs to guide computer design, 1998, Proceedings of the Thirty-First Hawaii International Conference on System Sciences.
[8] Quentin F. Stout, et al. The Use of the MPI Communication Library in the NAS Parallel Benchmarks, 1999.
[9] Eli Upfal, et al. Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems, 1997, IEEE Trans. Parallel Distributed Syst.
[10] Sathish S. Vadhiyar, et al. Automatically Tuned Collective Communications, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[11] Xin Yuan, et al. A Message Scheduling Scheme for All-to-All Personalized Communication on Ethernet Switched Clusters, 2007, IEEE Transactions on Parallel and Distributed Systems.
[12] Xin Yuan, et al. Pipelined broadcast on Ethernet switched clusters, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[13] Quentin F. Stout, et al. Statistical Analysis of Communication Time on the IBM SP2, 2008.
[14] Jeffrey S. Vetter, et al. An Empirical Performance Evaluation of Scalable Scientific Applications, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[15] Jack J. Dongarra, et al. Performance Analysis of MPI Collective Operations, 2005, IPDPS.
[16] I. Rosenblum, et al. Multi-Processor Molecular Dynamics Using the Brenner Potential: Parallelization of an Implicit Multi-Body Potential, 1999.
[17] Anthony Skjellum, et al. A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard, 1996, Parallel Comput.
[18] F. Petrini, et al. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q, 2003, ACM/IEEE SC 2003 Conference (SC'03).