The design of ultra scalable MPI collective communication on the K computer
S. Sumimoto | Atsuya Uno | K. Miura | F. Shoji | M. Yokokawa | M. Kurokawa | Naoyuki Shida | Tomoya Adachi
[1] Yogish Sabharwal, et al. Optimal bucket algorithms for large MPI collectives on torus interconnects, 2010, ICS '10.
[2] Toshiyuki Shimizu, et al. Tofu: A 6D Mesh/Torus Interconnect for Exascale Computers, 2009, Computer.
[3] George Bosilca, et al. Open MPI: A High-Performance, Heterogeneous MPI, 2006, 2006 IEEE International Conference on Cluster Computing.
[4] Philip Heidelberger, et al. Optimization of MPI collective communication on BlueGene/L systems, 2005, ICS '05.
[5] Rolf Rabenseifner, et al. Optimization of Collective Reduction Operations, 2004, International Conference on Computational Science.
[6] Robert A. van de Geijn, et al. Broadcasting on Meshes with Wormhole Routing, 1996, J. Parallel Distributed Comput.
[7] Robert A. van de Geijn, et al. A Pipelined Broadcast for Multidimensional Meshes, 1995, Parallel Process. Lett.
[8] R. A. van de Geijn, et al. Efficient Global Combine Operations, 1991.
[9] M. Simmen, et al. Comments on broadcast algorithms for two-dimensional grids, 1991, Parallel Comput.
[10] Yousef Saad, et al. Data communication in parallel architectures, 1989, Parallel Comput.
[11] Tomohiro Inoue, et al. The Tofu Interconnect, 2012, IEEE Micro.
[12] Fumiyoshi Shoji, et al. Implementation and Evaluation of MPI Allreduce on the K Computer, 2011.
[13] D. G. Payne, et al. Broadcasting on Meshes with Worm-hole Routing, 1996.