Efficient and Scalable Routing Algorithms for Collective Communication Operations on 2D All-Port Torus Networks
[1] Amith R. Mamidala, et al. MPI Collective Communications on The Blue Gene/P Supercomputer: Algorithms and Optimizations, 2009, 2009 17th IEEE Symposium on High Performance Interconnects.
[2] Gang Liu, et al. Optimal All-to-All Personalized Communication in All-Port Tori, 2006, First International Multi-Symposiums on Computer and Computational Sciences (IMSCCS'06).
[3] Dhabaleswar K. Panda, et al. Hybrid algorithms for complete exchange in 2D meshes, 2001, ICS '96.
[4] Broadcasting and NP-completeness.
[5] Xiaotong Zhuang, et al. A recursion-based broadcast paradigm in wormhole routed networks, 2005, IEEE Transactions on Parallel and Distributed Systems.
[6] Joseph G. Peters, et al. Circuit-Switched Broadcasting in Torus Networks, 1996, IEEE Trans. Parallel Distributed Syst.
[7] Young-Joo Suh, et al. All-to-All Personalized Communication in Multidimensional Torus and Mesh Networks, 2001, IEEE Trans. Parallel Distributed Syst.
[8] Yu-Chee Tseng, et al. Efficient Broadcasting in Wormhole-Routed Multicomputers: A Network-Partitioning Approach, 1999, IEEE Trans. Parallel Distributed Syst.
[9] Robert A. van de Geijn, et al. A Pipelined Broadcast for Multidimensional Meshes, 1995, Parallel Process. Lett.
[10] Yuanyuan Yang, et al. Near-optimal all-to-all broadcast in multidimensional all-port meshes and tori, 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[11] Philippe Michallon, et al. Schemas de communications globales dans les reseaux de processeurs: application a la grille torique (Global communication schemes in processor networks; application in torus), 1994.
[12] Yu-Chee Tseng, et al. Bandwidth-Optimal Complete Exchange on Wormhole-Routed 2D/3D Torus Networks: A Diagonal-Propagation Approach, 1997, IEEE Trans. Parallel Distributed Syst.
[13] Yogish Sabharwal, et al. Optimal bucket algorithms for large MPI collectives on torus interconnects, 2010, ICS '10.
[14] Philip Heidelberger, et al. Optimization of All-to-All Communication on the Blue Gene/L Supercomputer, 2008, 2008 37th International Conference on Parallel Processing.
[15] Kai Hwang, et al. Advanced computer architecture - parallelism, scalability, programmability, 1992.
[16] Michal Soch, et al. Time-Optimal Gossip of Large Packets in Noncombining 2D Tori and Meshes, 1999, IEEE Trans. Parallel Distributed Syst.
[17] Yu-Chee Tseng. A Dilated-Diagonal-Based Scheme for Broadcast in a Wormhole-Routed 2D Torus, 1997, IEEE Trans. Computers.
[18] Yih-Jia Tsai, et al. Broadcast in all-port wormhole-routed 3D mesh networks using extended dominating sets, 1994, Proceedings of 1994 International Conference on Parallel and Distributed Systems.
[19] Robert A. van de Geijn, et al. Collective communication: theory, practice, and experience, 2007, Concurr. Comput. Pract. Exp.
[20] Yu-Chee Tseng, et al. Toward Optimal Complete Exchange on Wormhole-Routed Tori, 1997, IEEE Trans. Computers.
[21] Young-Joo Suh, et al. All-To-All Communication with Minimum Start-Up Costs in 2D/3D Tori and Meshes, 1998, IEEE Trans. Parallel Distributed Syst.
[22] H. N. Mamadou, et al. A robust dynamic optimization for MPI Alltoall operation, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[23] Denis Trystram, et al. Minimum Depth Arcs-Disjoint Spanning Trees for Broadcasting on Wrap-Around Meshes, 1995, ICPP.