ConnectX-2 CORE-Direct Enabled Asynchronous Broadcast Collective Communications
Manjunath Gorentla Venkata | Pavel Shamis | Gilad Shainer | Richard L. Graham | Joshua Ladd | Ishai Rabinovitz | Vasily Filipov
[1] Christopher Wilson,et al. COMB: a portable benchmark suite for assessing MPI overlap , 2002, Proceedings. IEEE International Conference on Cluster Computing.
[2] Stephen W. Poole,et al. Overlapping computation and communication: Barrier algorithms and ConnectX-2 CORE-Direct capabilities , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW).
[3] George Bosilca,et al. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation , 2004, PVM/MPI.
[4] Pavel Shamis,et al. Network Offloaded Hierarchical Collectives Using ConnectX-2's CORE-Direct Capabilities , 2010, EuroMPI.
[5] Steve Poole,et al. ConnectX-2 InfiniBand Management Queues: First Investigation of the New Support for Network Offloaded Collective Operations , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.
[6] Dhabaleswar K. Panda,et al. High performance and reliable NIC-based multicast over Myrinet/GM-2 , 2003, 2003 International Conference on Parallel Processing, 2003. Proceedings.
[7] F. Petrini,et al. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[8] Sayantan Sur,et al. High Performance Broadcast Support in LA-MPI over Quadrics , 2005, Int. J. High Perform. Comput. Appl.
[9] Amith R. Mamidala,et al. MPI Collective Communications on The Blue Gene/P Supercomputer: Algorithms and Optimizations , 2009, Hot Interconnects.
[10] Ronald Mraz,et al. Reducing the variance of point to point transfers in the IBM 9076 parallel computer , 1994, Proceedings of Supercomputing '94.
[11] Henri E. Bal,et al. Efficient multicast on Myrinet using link-level flow control , 1998, Proceedings. 1998 International Conference on Parallel Processing (Cat. No.98EX205).
[12] Manjunath Gorentla Venkata,et al. Cheetah: A Framework for Scalable Hierarchical Collective Operations , 2011, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[13] D. K. Panda. InfiniBand Architecture , 2001.
[14] Jesper Larsson Träff. A Simple Work-Optimal Broadcast Algorithm for Message-Passing Parallel Systems , 2004, PVM/MPI.
[15] Kees Verstoep,et al. Efficient reliable multicast on Myrinet , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.