Kernel-assisted and topology-aware MPI collective communications on multicore/many-core platforms
George Bosilca | Jack J. Dongarra | Teng Ma | Aurélien Bouteiller
[1] Kevin T. Pedretti, et al. SMARTMAP: Operating system support for efficient data sharing among processes on a multi-core processor, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[2] Amith R. Mamidala, et al. Efficient Shared Memory and RDMA Based Design for MPI_Allgather over InfiniBand, 2006, PVM/MPI.
[3] Hao Zhu, et al. Hierarchical Collectives in MPICH2, 2009, PVM/MPI.
[4] Ron Brightwell, et al. Exploiting Direct Access Shared Memory for MPI On Multi-Core Processors, 2010, Int. J. High Perform. Comput. Appl.
[5] Wenguang Chen, et al. MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, 2006, ICS '06.
[6] Lars Paul Huse. Collective Communication on Dedicated Clusters of Workstations, 1999, PVM/MPI.
[7] Brice Goglin, et al. Dodging Non-uniform I/O Access in Hierarchical Collective Operations for Multicore Clusters, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[8] Guillaume Mercier, et al. Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis, 2009, 2009 International Conference on Parallel Processing.
[9] George Bosilca, et al. Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs, 2011, 2011 International Conference on Parallel Processing.
[10] Thomas Hérault, et al. Process Distance-Aware Adaptive MPI Collective Communications, 2011, 2011 IEEE International Conference on Cluster Computing.
[11] Ying Qian, et al. RDMA-based and SMP-aware Multi-port All-Gather on Multi-rail QsNet^II SMP Clusters, 2007, 2007 International Conference on Parallel Processing (ICPP 2007).
[12] Sayantan Sur, et al. LiMIC: support for high-performance MPI intra-node communication on Linux cluster, 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[13] Bronis R. de Supinski, et al. Exploiting hierarchy in parallel computer networks to optimize collective operation performance, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[14] George Bosilca, et al. HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[15] Rajeev Thakur, et al. Improving the Performance of Collective Operations in MPICH, 2003, PVM/MPI.
[16] Amith R. Mamidala, et al. Efficient SMP-aware MPI-level broadcast over InfiniBand's hardware multicast, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[17] Bronis R. de Supinski, et al. A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids, 2002, ArXiv.
[18] Dhabaleswar K. Panda, et al. Designing topology-aware collective communication algorithms for large scale InfiniBand clusters: Case studies with Scatter and Gather, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[19] Guillaume Mercier, et al. hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing.
[20] Dhabaleswar K. Panda, et al. Designing multi-leader-based Allgather algorithms for multi-core clusters, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.