An MPI Halo-Cell Implementation for Zero-Copy Abstraction
Allen D. Malony | Sameer Shende | Julien Jaeger | Patrick Carribault | Marc Pérache | Jean-Baptiste Besnard
[1] Michail Alvanos, et al. Memory Management Techniques for Exploiting RDMA in PGAS Languages, 2014, LCPC.
[2] L. Dagum, et al. OpenMP: an industry standard API for shared-memory programming, 1998.
[3] Torsten Hoefler, et al. Hybrid MPI: Efficient message passing for multi-core systems, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[4] Raymond Namyst, et al. MPC: A Unified Parallel Runtime for Clusters of NUMA Machines, 2008, Euro-Par.
[5] Brice Goglin, et al. KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework, 2013, J. Parallel Distributed Comput.
[6] Guillaume Mercier, et al. Design and evaluation of Nemesis, a scalable, low-latency, message-passing communication subsystem, 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).
[7] Barbara M. Chapman, et al. Introducing OpenSHMEM: SHMEM for the PGAS community, 2010, PGAS '10.
[8] Torsten Hoefler, et al. Ownership passing: efficient distributed memory programming on multi-core systems, 2013, PPoPP '13.
[9] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.
[10] John Shalf, et al. The International Exascale Software Project roadmap, 2011, Int. J. High Perform. Comput. Appl.
[11] Guillaume Mercier, et al. Data Transfers between Processes in an SMP System: Performance Study and Application to MPI, 2006, 2006 International Conference on Parallel Processing (ICPP'06).
[12] Robert W. Numrich, et al. Co-array Fortran for parallel programming, 1998, FORF.
[13] Sayantan Sur, et al. Unifying UPC and MPI runtimes: experience with MVAPICH, 2010, PGAS '10.
[14] Alan Wagner, et al. FG-MPI: Fine-grain MPI for multicore and clusters, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[15] Robert J. Harrison, et al. Global arrays: A nonuniform memory access programming model for high-performance computers, 1996, The Journal of Supercomputing.