Application Level Reordering of Remote Direct Memory Access Operations
[1] Katherine A. Yelick, et al. An Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[2] José Duato, et al. A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks, 1993, IEEE Trans. Parallel Distributed Syst.
[3] William J. Dally. Virtual-channel flow control, 1990, ISCA '90.
[4] Csaba Andras Moritz, et al. LoGPC: Modeling Network Contention in Message-Passing Programs, 2001, IEEE Trans. Parallel Distributed Syst.
[5] Rajeev Thakur, et al. Optimization of Collective Communication Operations in MPICH, 2005, Int. J. High Perform. Comput. Appl.
[6] Katherine A. Yelick, et al. Optimizing bandwidth limited problems using one-sided communication and overlap, 2005, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[7] Amith R. Mamidala, et al. Looking under the hood of the IBM Blue Gene/Q network, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Chris J. Scheiman, et al. LogGP: Incorporating Long Messages into the LogP Model for Parallel Computation, 1997, J. Parallel Distributed Comput.
[9] Mike Higgins, et al. Cray Cascade: A scalable HPC system based on a Dragonfly network, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[10] Leonid Oliker, et al. merAligner: A Fully Parallel Sequence Aligner, 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[11] Katherine A. Yelick, et al. Scaling communication-intensive applications on BlueGene/P using one-sided communication and overlap, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[12] Steven G. Johnson, et al. The Design and Implementation of FFTW3, 2005, Proceedings of the IEEE.
[13] David H. Bailey, et al. The NAS parallel benchmarks summary and preliminary results, 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[14] Steven L. Scott, et al. The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus, 1996.
[15] Paul D. Gader, et al. Image algebra techniques for parallel image processing, 1987.
[16] Ramesh Subramonian, et al. LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.
[17] Dan Bonachea. GASNet Specification, v1.1, 2002.
[18] Samuel Williams, et al. Optimization of geometric multigrid for emerging multi- and manycore processors, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[19] José Duato, et al. A new theory of deadlock-free adaptive multicast routing in wormhole networks, 1993, Proceedings of 1993 5th IEEE Symposium on Parallel and Distributed Processing.
[20] Hyun-Wook Jin, et al. Scheduling of MPI-2 one sided operations over InfiniBand, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[21] Katherine A. Yelick, et al. Communication avoiding and overlapping for numerical linear algebra, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.