Scalable Algorithms for MPI Intergroup Allgather and Allgatherv
Wei-keng Liao | Alok N. Choudhary | Ankit Agrawal | Reda Al-Bahrani | Jesper Larsson Träff | Qiao Kang
[1] Fan Zhang, et al. Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[2] Eli Upfal, et al. Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems, 1997, IEEE Trans. Parallel Distributed Syst.
[3] Jesper Larsson Träff, et al. Optimal broadcast for fully connected processor-node networks, 2008, J. Parallel Distributed Comput.
[4] Jesper Larsson Träff, et al. A Pipelined Algorithm for Large, Irregular All-Gather Problems, 2010, Int. J. High Perform. Comput. Appl.
[5] William Gropp, et al. Fault Tolerance in Message Passing Interface Programs, 2004, Int. J. High Perform. Comput. Appl.
[6] Robert A. van de Geijn, et al. Collective communication: theory, practice, and experience, 2007, Concurr. Comput. Pract. Exp.
[7] George Karypis, et al. Introduction to Parallel Computing, 1994.
[8] Pedro V. Silva, et al. Implementing MPI-2 Extended Collective Operations, 1999, PVM/MPI.
[9] Rajeev Thakur, et al. Optimization of Collective Communication Operations in MPICH, 2005, Int. J. High Perform. Comput. Appl.
[10] Jesper Larsson Träff, et al. Two-tree algorithms for full bandwidth broadcast, reduction and scan, 2009, Parallel Comput.
[11] George Bosilca, et al. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation, 2004, PVM/MPI.
[12] S. Lennart Johnsson, et al. Optimum Broadcasting and Personalized Communication in Hypercubes, 1989, IEEE Trans. Computers.
[13] Amith R. Mamidala, et al. Efficient Shared Memory and RDMA Based Design for MPI_Allgather over InfiniBand, 2006, PVM/MPI.
[14] Joshua P. Hacker, et al. Ensemble Data Assimilation to Characterize Surface-Layer Errors in Numerical Weather Prediction Models, 2013.
[15] Fan Zhang, et al. ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing, 2015, Concurr. Comput. Pract. Exp.
[16] Jesper Larsson Träff, et al. Full-Duplex Inter-Group All-to-All Broadcast Algorithms with Optimal Bandwidth, 2018, EuroMPI.
[17] Alok N. Choudhary, et al. A flexible I/O arbitration framework for netCDF‐based big data processing workflows on high‐end supercomputers, 2017, Concurr. Comput. Pract. Exp.