Topology-oblivious optimization of MPI broadcast algorithms on extreme-scale platforms
[1] Torsten Hoefler,et al. A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[2] Jelena Pjesivac-Grbovic,et al. Towards Automatic and Adaptive Optimizations of MPI Collective Operations, 2007.
[3] Alexey L. Lastovetsky,et al. High-Level Topology-Oblivious Optimization of MPI Broadcast Algorithms on Extreme-Scale Platforms , 2014, Euro-Par Workshops.
[4] Alexey L. Lastovetsky,et al. MPIBlib: Benchmarking MPI Communications for Parallel Computing on Homogeneous and Heterogeneous Clusters , 2008, PVM/MPI.
[5] Robert A. van de Geijn,et al. A Pipelined Broadcast for Multidimensional Meshes , 1995, Parallel Process. Lett.
[6] Jesper Larsson Träff,et al. Optimal broadcast for fully connected processor-node networks , 2008, J. Parallel Distributed Comput.
[7] Manjunath Gorentla Venkata,et al. Cheetah: A Framework for Scalable Hierarchical Collective Operations , 2011, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[8] Alexey L. Lastovetsky,et al. Hierarchical approach to optimization of parallel matrix multiplication on large-scale platforms , 2015, The Journal of Supercomputing.
[9] Kiril Dichev,et al. Optimization of Collective Communication for Heterogeneous HPC Platforms , 2014, HiPC 2014.
[10] J. Watts,et al. Interprocessor collective communication library (InterCom) , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[11] Peter Sanders,et al. A bandwidth latency tradeoff for broadcast and reduction , 2003, Inf. Process. Lett.
[12] Rajeev Thakur,et al. Optimization of Collective Communication Operations in MPICH , 2005, Int. J. High Perform. Comput. Appl.
[13] Dhabaleswar K. Panda,et al. Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters , 2013, 2013 IEEE 21st Annual Symposium on High-Performance Interconnects.
[14] George Bosilca,et al. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation , 2004, PVM/MPI.
[15] Philip Heidelberger,et al. The deep computing messaging framework: generalized scalable message passing on the blue gene/P supercomputer , 2008, ICS '08.
[16] Amith R. Mamidala,et al. Fast and scalable MPI-level broadcast using InfiniBand's hardware multicast support , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[17] Roger W. Hockney,et al. The Communication Challenge for MPP: Intel Paragon and Meiko CS-2 , 1994, Parallel Computing.
[18] S. Lennart Johnsson,et al. Optimum Broadcasting and Personalized Communication in Hypercubes , 1989, IEEE Trans. Computers.