SLOAVx: Scalable LOgarithmic AlltoallV Algorithm for Hierarchical Multicore Systems
Cong Xu | Weikuan Yu | Yandong Wang | Manjunath Gorentla Venkata | Zhuo Liu | Richard L. Graham
[1] Brice Goglin, et al. KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework, 2013, J. Parallel Distributed Comput.
[2] Adrian Jackson. Planned AlltoAllv.
[3] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.
[4] Jehoshua Bruck, et al. Efficient algorithms for all-to-all communications in multi-port message-passing systems, 1994, SPAA '94.
[5] Manjunath Gorentla Venkata, et al. Exploring the All-to-All Collective Optimization Space with ConnectX CORE-Direct, 2012, 2012 41st International Conference on Parallel Processing.
[6] Ahmad Faraj, et al. Communication Characteristics in the NAS Parallel Benchmarks, 2002, IASTED PDCS.
[7] Manjunath Gorentla Venkata, et al. Cheetah: A Framework for Scalable Hierarchical Collective Operations, 2011, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[8] Dhabaleswar K. Panda, et al. Scalable, high-performance NIC-based all-to-all broadcast over Myrinet/GM, 2004, 2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935).
[9] M. Plummer, et al. An LPAR-customized MPI_AllToAllV for the Materials Science code CASTEP, 2004.
[10] Roger W. Hockney, et al. The Communication Challenge for MPP: Intel Paragon and Meiko CS-2, 1994, Parallel Computing.
[11] Xin Yuan, et al. Automatic generation and tuning of MPI collective communication routines, 2005, ICS '05.
[12] George Bosilca, et al. Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs, 2011, 2011 International Conference on Parallel Processing.
[13] Eli Upfal, et al. Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems, 1997, IEEE Trans. Parallel Distributed Syst.
[14] Keith D. Underwood, et al. An analysis of the impact of MPI overlap and independent progress, 2004, ICS '04.