Analyzing the Impact of Overlap, Offload, and Independent Progress for Message Passing Interface Applications
[1] Dhabaleswar K. Panda, et al. Performance Comparison of MPI Implementations over InfiniBand, Myrinet and Quadrics, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[2] Remzi H. Arpaci-Dusseau, et al. Architectural Requirements and Scalability of the NAS Parallel Benchmarks, 1999, ACM/IEEE SC 1999 Conference (SC'99).
[3] Keith D. Underwood, et al. Characterizing a new class of threads in scientific applications for high end supercomputers, 2004, ICS '04.
[4] Wu-chun Feng, et al. The Quadrics Network: High-Performance Clustering Technology, 2002, IEEE Micro.
[5] Dhabaleswar K. Panda, et al. Microbenchmark performance comparison of high-speed cluster interconnects, 2004, IEEE Micro.
[6] David Scott, et al. A TeraFLOP supercomputer in 1996: the ASCI TFLOP system, 1996, Proceedings of International Conference on Parallel Processing.
[7] Scott Pakin, et al. Identifying and Eliminating the Performance Variability on the ASCI Q Machine, 2003.
[8] Dhabaleswar K. Panda, et al. High performance RDMA-based MPI implementation over InfiniBand, 2003, ICS.
[9] Charles L. Seitz, et al. Myrinet: A Gigabit-per-Second Local Area Network, 1995, IEEE Micro.
[10] Richard P. Martin, et al. Effects Of Communication Latency, Overhead, And Bandwidth In A Cluster Architecture, 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[11] Keith D. Underwood, et al. An analysis of NIC resource usage for offloading MPI, 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[12] Jeffrey S. Vetter, et al. Communication characteristics of large-scale scientific applications for contemporary cluster architectures, 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[13] Ron Brightwell. A New MPI Implementation for Cray SHMEM, 2004, PVM/MPI.
[14] Keith D. Underwood, et al. The impact of MPI queue usage on message latency, 2004, International Conference on Parallel Processing, 2004. ICPP 2004.
[15] Wolfgang Rehm, et al. Implementing an MPICH-2 channel device over VAPI on InfiniBand, 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[16] Rolf Riesen, et al. Portals 3.0: protocol building blocks for low overhead communication, 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[17] Keith D. Underwood, et al. An Initial Analysis of the Impact of Overlap and Independent Progress for MPI, 2004, PVM/MPI.
[18] Ron Brightwell, et al. The Portals 3.0 Message Passing Interface Revision 1.0, 1999.
[19] F. Petrini, et al. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[20] Wolfgang Rehm, et al. An MPICH 2 Channel Device Implementation over VAPI on InfiniBand, 2004.
[21] Keith D. Underwood, et al. Evaluation of an Eager Protocol Optimization for MPI, 2003, PVM/MPI.
[22] Message Passing Interface Forum. MPI: A message-passing interface standard, 1994.
[23] R. Brightwell, et al. Design and implementation of MPI on Puma portals, 1996, Proceedings. Second MPI Developer's Conference.
[24] Sushmitha P. Kini, et al. Performance Comparison of MPI Implementations over InfiniBand, Myrinet and Quadrics, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[25] Alex Rapaport, et al. MPI-2: extensions to the message-passing interface, 1997.
[26] Rossen Dimitrov, et al. Impact of Latency on Applications’ Performance, 2001.