Scalable PGAS collective operations in NUMA clusters
Jorge González-Domínguez | Guillermo L. Taboada | Damián A. Mallón | Carlos Teijeiro | Andrés Gómez | Brian Wibecan
[1] Emmanuel Jeannot,et al. Near-Optimal Placement of MPI Processes on Hierarchical NUMA Architectures , 2010, Euro-Par.
[2] Robert A. van de Geijn,et al. Collective communication on architectures that support simultaneous communication over multiple links , 2006, PPoPP '06.
[3] Dhabaleswar K. Panda,et al. Designing multi-leader-based Allgather algorithms for multi-core clusters , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[4] Rajeev Thakur,et al. Optimization of Collective Communication Operations in MPICH , 2005, Int. J. High Perform. Comput. Appl.
[5] Joseph Antony,et al. Exploring Thread and Memory Placement on NUMA Architectures: Solaris and Linux, UltraSPARC/FirePlane and Opteron/HyperTransport , 2006, HiPC.
[6] Ashok Srinivasan,et al. Optimization of Collective Communication in Intra-cell MPI , 2007, HiPC.
[7] Galen M. Shipman,et al. X-SRQ - Improving Scalability and Performance of Multi-core InfiniBand Clusters , 2008, PVM/MPI.
[8] Katherine A. Yelick,et al. Tuning collective communication for Partitioned Global Address Space programming models , 2011, Parallel Comput..
[9] Kevin M. Lepak,et al. Cache Hierarchy and Memory Subsystem of the AMD Opteron Processor , 2010, IEEE Micro.
[10] Katherine Yelick,et al. Optimizing collective communication on multicores , 2009.
[11] Ahmed Sameh,et al. Potential Performance Improvement of Collective Operations in UPC , 2007, PARCO.
[12] Hyun-Wook Jin,et al. High performance MPI-2 one-sided communication over InfiniBand , 2004, IEEE International Symposium on Cluster Computing and the Grid, 2004. CCGrid 2004.
[13] Torsten Hoefler,et al. NUMA-aware shared-memory collective communication for MPI , 2013, HPDC.
[14] Ying Qian,et al. Design and Evaluation of Efficient Collective Communications on Modern Interconnects and Multi-core Clusters , 2010.
[15] Fernando Obelleiro Basteiro,et al. High scalability multipole method. Solving half billion of unknowns , 2009, Computer Science - Research and Development.
[16] Dhabaleswar K. Panda,et al. Scalable MPI design over InfiniBand using eXtended Reliable Connection , 2008, 2008 IEEE International Conference on Cluster Computing.
[17] Francisco F. Rivera,et al. On the Influence of Thread Allocation for Irregular Codes in NUMA Systems , 2009, 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies.
[18] Galen M. Shipman,et al. MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives , 2008, PVM/MPI.
[19] Jiulong Shan,et al. Single Data Copying for MPI Communication Optimization on Shared Memory System , 2007, International Conference on Computational Science.
[20] Amith R. Mamidala,et al. Scaling alltoall collective on multi-core systems , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[21] Kevin T. Pedretti,et al. Optimizing Multi-core MPI Collectives with SMARTMAP , 2009, 2009 International Conference on Parallel Processing Workshops.
[22] Torsten Hoefler,et al. A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[23] Raymond Namyst,et al. A multithreaded communication engine for multicore architectures , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[24] Dhabaleswar K. Panda,et al. Designing topology-aware collective communication algorithms for large scale InfiniBand clusters: Case studies with Scatter and Gather , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[25] Xiaofang Zhao,et al. Performance analysis and optimization of MPI collective operations on multi-core clusters , 2009, The Journal of Supercomputing.
[26] Amith R. Mamidala,et al. MPI Collectives on Modern Multicore Clusters: Performance Optimizations and Communication Characteristics , 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).
[27] Sathish S. Vadhiyar,et al. Automatically Tuned Collective Communications , 2000, ACM/IEEE SC 2000 Conference (SC'00).